Jan 29 14:04:40 <tflink>  the speaker is David Malcolm, packager of python for Fedora
Jan 29 14:05:11 <tflink>  he is planning to talk about different species of python (jython, cython etc.)
Jan 29 14:05:35 <tflink>  seems to be a technical audience (experience with python, Java, fedora packaging etc)
Jan 29 14:05:46 <DiscordianUK>  it probably would be
Jan 29 14:05:52 <DiscordianUK>  pypy etc
Jan 29 14:06:09 <tflink>  * attempting to get his laptop to behave and show the slides properly
Jan 29 14:06:25 <tflink>  So why do we care about the different species of python?
Jan 29 14:06:59 <tflink>  * interruption for request to transcribe the presentation
Jan 29 14:07:27 *  bcl (~bcl@neil.brianlane.com) has joined #fudcon-room-3
Jan 29 14:07:31 <tflink>  * interruption completed
Jan 29 14:07:49 <tflink>  slides will be uploaded to David Malcolm's fedora people page once he has internet access
Jan 29 14:07:57 <tflink>  so why do we care about different species of python?
Jan 29 14:08:17 <tflink>  intellectually interested in different implementations, different strengths/weaknesses
Jan 29 14:08:25 <tflink>  memory usage, debugging ability, etc.
Jan 29 14:08:53 <tflink>  also interacting with other technologies (e.g. jython for interacting with java)
Jan 29 14:09:18 <tflink>  Doesn't assert that there is a single best implementation of python - they all have their strengths and best places
Jan 29 14:09:23 <tflink>  so what is python for?
Jan 29 14:09:30 <tflink>   - one off scripts
Jan 29 14:09:37 <herlo>  tflink: thanks
Jan 29 14:09:44 <tflink>   - simple hacks that can be changed into something long-term
Jan 29 14:10:05 <tflink>   - highly readable high-level language
Jan 29 14:10:18 <tflink>   - Python is "Batteries Included"
Jan 29 14:10:33 <tflink>  * feel free to ask questions (even remote) - I will try to relay
Jan 29 14:10:54 <tflink>  Python can also be used as glue code for bridging libraries with high level code
Jan 29 14:11:08 <tflink>  sometimes, the linux community is too independent - won't accept a common runtime
Jan 29 14:11:23 *  jsmith-mobile (95a98657@gateway/web/freenode/ip.149.169.134.87) has joined #fudcon-room-3
Jan 29 14:11:42 <tflink>  python can be used as something of a "common runtime" (as much as anything)
Jan 29 14:12:07 <tflink>  Since python can be easily plugged into c++, easy to use with gdb
Jan 29 14:12:14 <tflink>  easy to bind to C libs is a strength
Jan 29 14:12:23 <tflink>  So where is python used in Fedora?
Jan 29 14:12:37 *  rdieter (~foo@fedora/rdieter) has joined #fudcon-room-3
Jan 29 14:12:46 <tflink>   powers *.fedoraproject.org
Jan 29 14:13:08 <tflink>  also used by TurboGears, Django, other apps (koji et al.)
Jan 29 14:13:23 <tflink>  Fedora infrastructure does use some Django, but it is minimal
Jan 29 14:13:46 <tflink>  So we have all these possible uses of python (glue code, web development, simple scripts ...)
Jan 29 14:14:05 <tflink>  -> "Python" vs "CPython"
Jan 29 14:14:10 <tflink>  Python -> language
Jan 29 14:14:31 <tflink>  CPython -> what most people think of as python (generally /usr/bin/python) and the original implementation
Jan 29 14:14:42 *  DiscordianUK nods
Jan 29 14:15:37 <tflink>  * missed the bullets on slide about kloc in sections of CPython
Jan 29 14:15:55 <tflink>  CPython's object system
Jan 29 14:15:58 <DiscordianUK>  many klocs I'm sure
Jan 29 14:16:24 <tflink>  CPython is an implementation in C and has objects and types hand-coded in C
Jan 29 14:16:34 <tflink>  Objects are C structs with a ref count
Jan 29 14:16:57 <tflink>  references between objects are just C pointers -> objects can't move around in memory
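The per-object reference count described above can be observed from Python itself. This is a minimal sketch that assumes CPython; `sys.getrefcount` is CPython-specific and its absolute values vary between versions, so only the differences are compared. The names `obj` and `alias` are illustrative.

```python
import sys

obj = []                                # any ordinary heap object
base = sys.getrefcount(obj)             # absolute value is an implementation detail

alias = obj                             # a second reference to the same object
with_alias = sys.getrefcount(obj)

del alias                               # dropping the reference decrements the count
after_del = sys.getrefcount(obj)

print(with_alias - base, after_del - base)
```

In C extension code the same increments and decrements have to be written by hand, which is where the mistakes mentioned later in the talk come from.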
Jan 29 14:17:13 *  Cerlyn (~Cerlyn@66.87.11.113) has joined #fudcon-room-3
Jan 29 14:17:51 <tflink>  there is one big mutex in python (for counting references, if I heard correctly)
Jan 29 14:18:05 <tflink>  * question about a patch by google to remove that mutex
Jan 29 14:18:13 <bcl>  The Global Interpreter Lock (GIL)
Jan 29 14:18:26 <tflink>  there was an attempt to remove the mutex in the past (0.99 era?) but it failed
Jan 29 14:18:58 <tflink>  * transcriber's note - sorry, I'm having a bit of trouble keeping up. missing a little bit
Jan 29 14:19:23 <DiscordianUK>  thanks for what you are doing
Jan 29 14:19:25 <tflink>  the other issue with CPython is reference counting
Jan 29 14:19:54 <tflink>  these pointers are being passed around by hand, and it's easy to get wrong
Jan 29 14:20:08 <tflink>  can end up with memory leaks, segfaults and other hard to debug situations
Jan 29 14:20:12 <tflink>  but on the other hand, it is simple
Jan 29 14:20:23 *  burriedu2 (95a9ac77@gateway/web/freenode/ip.149.169.172.119) has joined #fudcon-room-3
Jan 29 14:20:32 <tflink>  the next part of CPython is the interpreter
Jan 29 14:20:51 <tflink>  python compiles the code down to bytecode which is a series of simple operations
Jan 29 14:21:35 <tflink>  the .py files are turned into a syntax tree, which is turned into instructions that run on the "fake" CPU; some operations are collapsed into just data
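The bytecode for a function can be inspected with the standard `dis` module. This is a small sketch assuming a CPython 3 interpreter; the function `add_one` is illustrative, and the exact opcode names printed differ between CPython versions, so no particular listing is shown here.

```python
import dis

def add_one(x):
    return x + 1

# Collect the opcode names the compiler produced for this function.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]

# Print a human-readable listing of the same instructions.
dis.dis(add_one)
```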
Jan 29 14:21:54 <tflink>  example: if using the len() function, it is possible to redefine that
Jan 29 14:22:19 <tflink>  also possible to redefine stuff like True and False, so it's hard to do traditional optimizations at compile time
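The point about `len()` can be tried directly. This is a sketch for Python 3, where `True` and `False` are keywords and can no longer be rebound but builtins such as `len` still can. The replacement lambda and the function `describe` are made up for illustration.

```python
import builtins

def describe(items):
    # `len` is looked up by name every time this line runs,
    # so the compiler cannot assume it is the builtin.
    return len(items)

before = describe([1, 2, 3])

original_len = builtins.len
builtins.len = lambda obj: 42           # hypothetical replacement
try:
    after = describe([1, 2, 3])         # same code, different result
finally:
    builtins.len = original_len         # always restore the real builtin

print(before, after)
```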
Jan 29 14:22:36 <tflink>  * question - is byte code consistent between implementations?
Jan 29 14:22:46 <tflink>  no, the bytecode is not consistent between the implementations
Jan 29 14:23:20 <DiscordianUK>  that's a fail then
Jan 29 14:23:25 <tflink>  there is a marker in the .pyc that identifies the version of the bytecode generated and a timestamp of the associated .py file
Jan 29 14:23:54 <tflink>  so when you compile a .py file to .pyc, it's kind of like a makefile
Jan 29 14:24:12 <tflink>  when you run a .py, the bytecode has to exactly match the runtime, or else it will be recompiled
Jan 29 14:24:49 <tflink>  but the bytecode generally stays consistent between updates of the same version (ex. python2.7 versions all have the same bytecode "magic number")
Jan 29 14:25:02 <tflink>  but the bytecode number could change between development versions
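The version marker at the start of a .pyc file can be compared with the running interpreter's own magic number. This is a sketch assuming CPython 3; the file name `hello.py` is illustrative and is written to a temporary directory.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.py")
    with open(src, "w") as fh:
        fh.write("print('hello')\n")

    # Compile to an explicit .pyc path; compile() returns that path.
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "hello.pyc"), doraise=True)

    # The first four bytes of the .pyc are the bytecode "magic number".
    with open(pyc, "rb") as fh:
        header_magic = fh.read(4)

matches_runtime = header_magic == importlib.util.MAGIC_NUMBER
print(matches_runtime)
```

If the marker did not match, the interpreter would ignore the .pyc and recompile from the .py file.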
Jan 29 14:25:30 <tflink>  * missed the question
Jan 29 14:25:44 <DiscordianUK>  ahhh so the bytecode is consistent across OSes?
Jan 29 14:26:03 <tflink>  the problem is that the .pyc files were generally living next to the .py files, but there is a proposal to change that
Jan 29 14:26:17 <tflink>  the new proposal would have a separate directory for bytecode
Jan 29 14:26:26 <tflink>  in a __pycache__ directory
Jan 29 14:26:38 <tflink>  that would have a dir for each bytecode version
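In CPython 3 releases that ship this scheme, the cache location for a source file can be asked for directly. This is a sketch; `example.py` is a hypothetical file name that does not need to exist, and the result is typically something like `__pycache__/example.cpython-XY.pyc`, where the tag depends on the interpreter.

```python
import importlib.util
import sys

# Where would the bytecode for this source file be cached?
cache_path = importlib.util.cache_from_source("example.py")

# The tag embedded in the file name identifies the implementation and version.
tag = sys.implementation.cache_tag

print(cache_path, tag)
```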
Jan 29 14:27:13 <tflink>  DiscordianUK: the bytecode should be consistent across OSs
Jan 29 14:27:20 <tflink>  the important variable is the runtime
Jan 29 14:28:06 <tflink>  so the opcodes (and the byte code) will change between pypy, CPython, Jython, IronPython etc.
Jan 29 14:28:16 <tflink>  but should stay the same for all versions of CPython 2.6 etc.
Jan 29 14:28:36 <tflink>  * question - are the python versions backward compatible (can you run 2.3 bytecode on a 2.6 interpreter)?
Jan 29 14:28:53 <tflink>  no, you can't do that because they may have removed opcodes or added opcodes
Jan 29 14:29:07 <tflink>  even the functions could have changed between versions
Jan 29 14:29:19 *  djf_jeff (~jeff@184-106-95-233.static.cloud-ips.com) has joined #fudcon-room-3
Jan 29 14:29:32 <tflink>  i.e. there are semantic differences between the different versions of python
Jan 29 14:30:21 <tflink>  * example on screen - not sure that I can type fast enough
Jan 29 14:30:52 <DiscordianUK>  there will hopefully be slides
Jan 29 14:31:07 <tflink>  talking about decrementing the reference count inside a while loop and some of the potential problems in CPython implementation
Jan 29 14:31:18 <tflink>  DiscordianUK: he said he would post them when he gets internet access
Jan 29 14:31:34 <tflink>  The good parts of CPython:
Jan 29 14:31:38 <bcl>  main loop is a giant switch() statement to process the .pyc opcodes
Jan 29 14:31:49 <tflink>  easy to bind to C code
Jan 29 14:31:56 <tflink>  (just please do it correctly)
Jan 29 14:32:22 <tflink>  you can wrap other C code with "python like" data types to be able to include it into python code
Jan 29 14:32:43 <tflink>  it is a rather simple implementation, in the grand scheme of things
Jan 29 14:32:49 <tflink>  the bad parts of CPython:
Jan 29 14:33:14 <tflink>  it is a bit slow since you're always interpreting the bytecode - never going to be as fast as machine code
Jan 29 14:33:32 <tflink>  since the language is so dynamic, you can't use a lot of the traditional optimizations for compile time
Jan 29 14:34:10 <tflink>  The Global Interpreter Lock is another disadvantage
Jan 29 14:34:25 <tflink>  * question - what about google's unladen swallow?
Jan 29 14:34:33 <tflink>  that was a project to add a JIT to CPython
Jan 29 14:35:09 <tflink>  they tried to take LLVM (low level virtual machine) which is a library that implements a lot of the things that a compiler could use
Jan 29 14:35:36 <tflink>  so you could construct fragments of code and say "give me machine code"
Jan 29 14:36:00 <tflink>  for example, when a piece of python code is called 1000 times with the same int values
Jan 29 14:36:26 <tflink>  so the JIT would make machine code instead of a big switch statement
Jan 29 14:36:34 <tflink>  the hope was that it would provide a HUGE speedup
Jan 29 14:36:42 <tflink>  unfortunately, it was only about 20% speedup
Jan 29 14:37:10 <tflink>  since you are generating all that code at runtime, you are doing a LOT of checks (~5 conditionals before adding 2 ints)
Jan 29 14:37:24 <tflink>  in theory, you could optimize all of that away with clever coding
Jan 29 14:37:45 <tflink>  but the last word on the mailing list was that all the people at google who were working on this have moved on to other projects
Jan 29 14:38:01 <tflink>  at this time, there doesn't seem to be any people primed to take over the unladen swallow project
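The runtime checks mentioned above for adding two ints can be pictured roughly as follows. This is a simplified sketch written in Python rather than the interpreter's actual C code; `dynamic_add` is a made-up name, and it leaves out details such as the priority given to subclasses of the left operand.

```python
def dynamic_add(left, right):
    """Simplified sketch of what `left + right` has to check at runtime."""
    # Fast path a JIT would like to assume: both operands are plain ints.
    if type(left) is int and type(right) is int:
        return int.__add__(left, right)

    # General path: ask the left operand's type first.
    add = getattr(type(left), "__add__", None)
    if add is not None:
        result = add(left, right)
        if result is not NotImplemented:
            return result

    # Fall back to the reflected method on the right operand's type.
    radd = getattr(type(right), "__radd__", None)
    if radd is not None:
        result = radd(right, left)
        if result is not NotImplemented:
            return result

    raise TypeError("unsupported operand types for +")

print(dynamic_add(2, 3), dynamic_add(1, 2.5), dynamic_add("a", "b"))
```

Even the int-only fast path needs type checks before the addition, which is the kind of overhead the generated machine code still carried.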
Jan 29 14:38:03 <tflink>  moving on ...
Jan 29 14:38:23 <tflink>  reference counting fun -> it is too easy to get it wrong and cause a crash or other problems
Jan 29 14:38:36 <tflink>  Objects can't move around in memory and this can fragment the heap
Jan 29 14:39:06 <tflink>  lots of reference count twiddling -> impossible to have read-only data in shared memory pages (e.g. KVM's KSM)
Jan 29 14:39:22 <tflink>  There is a non-opaque object API
Jan 29 14:39:30 <tflink>  the implementation details are visible to C extensions
Jan 29 14:39:45 <tflink>  this makes them hard to change without breaking hundreds of extensions
Jan 29 14:39:57 <tflink>  example: strings are merely string + length
Jan 29 14:40:13 <tflink>  * notes that there isn't much time left
Jan 29 14:40:23 <tflink>  if you're going to do extensions, please use Cython
Jan 29 14:40:36 <tflink>  it auto-generates .c code, handling a lot of the details
Jan 29 14:40:50 <tflink>  PLEASE don't use SWIG (this will probably be controversial)
Jan 29 14:41:05 <tflink>  all you get are python objects that wrap C and C++ pointers
Jan 29 14:41:12 <tflink>  * going faster now
Jan 29 14:41:18 <tflink>  Debug builds
Jan 29 14:41:30 <tflink>  Compile CPython --with-pydebug
Jan 29 14:41:49 <tflink>  adds lots of useful debugging instrumentation and makes it easier to debug in gdb
Jan 29 14:41:52 <tflink>  but it is a LOT slower
Jan 29 14:42:33 <tflink>  keep in mind that while the .h files are the same between debug and non-debug builds, the .so files are NOT COMPATIBLE with the regular optimized python (There are ABI differences)
Jan 29 14:42:51 <tflink>  so for example, you couldn't run yum with debug python
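Whether the running interpreter is a debug build can be checked from Python. This is a sketch assuming CPython; `is_debug_build` is an illustrative helper name. `sys.gettotalrefcount` exists only in `--with-pydebug` builds, and on POSIX systems `sys.abiflags` contains a `d` for such builds.

```python
import sys

def is_debug_build():
    # sys.gettotalrefcount is only present in --with-pydebug builds.
    return hasattr(sys, "gettotalrefcount")

# sys.abiflags is not available on every platform, so fall back to "".
abi_flags = getattr(sys, "abiflags", "")

print(is_debug_build(), repr(abi_flags))
```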
Jan 29 14:42:57 <tflink>  -- Python 3
Jan 29 14:43:13 <tflink>  Python 3 is a big rewrite of CPython 2, fixing lots of long-standing problems
Jan 29 14:43:26 <tflink>  there are syntactic differences from Python 2
Jan 29 14:43:44 <tflink>  changes in the standard library, different .pyc files
Jan 29 14:44:04 <tflink>  but it should be much nicer to use than Python 2 (in the speaker's opinion)
Jan 29 14:44:17 <tflink>  there is slowly growing 3rd party module support
Jan 29 14:44:34 <tflink>  * question - Is the global lock gone in python 3?
Jan 29 14:44:45 <tflink>  no, it is there still
Jan 29 14:44:58 <tflink>  there are arch. issues that won't be fixed in Python 3
Jan 29 14:45:05 <tflink>  -> alternate Python
Jan 29 14:45:08 <tflink>  Jython:
Jan 29 14:45:24 <tflink>  Java base class: org.python.core.PyObject
Jan 29 14:45:38 <tflink>  Can wrap arbitrary objects in python
Jan 29 14:45:45 <tflink>  so what is the runtime of Jython?
Jan 29 14:45:50 <tflink>  it's java
Jan 29 14:45:59 *  badkittydaddy (95a9941d@gateway/web/freenode/ip.149.169.148.29) has joined #fudcon-room-3
Jan 29 14:46:05 <tflink>  the .py files are compiled to syntax trees, then converted directly into java bytecode
Jan 29 14:46:26 <tflink>  in theory, it should be fast (JIT-compiled machine code)
Jan 29 14:46:53 <tflink>  however, much of the time, the Java bytecode is calling back into the core PyObject code, which has to implement some messy switch statements
Jan 29 14:47:08 <tflink>  * q - doesn't java do a lot of the same things?
Jan 29 14:47:36 <tflink>  kind of, but java is a lot more static and there is some hacking in order to bridge the two worlds
Jan 29 14:47:45 <tflink>  so what are the advantages of jython?
Jan 29 14:47:57 <tflink>  you can embed it inside a java appserver
Jan 29 14:48:17 <tflink>  you can use the java garbage collector (the CPython GC is not very good)
Jan 29 14:48:33 <tflink>  the java VM is perhaps the best open source runtime that we have:
Jan 29 14:48:52 <tflink>  JIT, GC, years of research and competition, no GIL
Jan 29 14:49:21 <tflink>  * missed question
Jan 29 14:49:58 <tflink>  the GIL tends to be an issue when you're trying to get max performance
Jan 29 14:50:13 <tflink>  for example, some of yum's perf issues come from talking to disk
Jan 29 14:50:23 <tflink>  you want it to have a simple interface, though
Jan 29 14:50:36 <tflink>  but when you are working on a script, how fast does it really have to be?
Jan 29 14:50:50 <tflink>  the performance issues are more prevalent in webserver space
Jan 29 14:50:59 <tflink>  GIL hampers multithreaded parallelism
Jan 29 14:51:09 <tflink>  -> back to Jython
Jan 29 14:51:26 <tflink>  you can also use Java DB bindings in python code
Jan 29 14:51:31 <tflink>  but Jython is still at 2.6
Jan 29 14:51:39 <tflink>  ==> IronPython
Jan 29 14:51:53 <tflink>  IP is similar to Jython but on top of the CLR (.NET Runtime)
Jan 29 14:52:06 <bcl>  question was about using the multiprocessing module instead of threads. It breaks them out into subprocesses instead of threads so multicores can be taken advantage of.
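The multiprocessing approach from that question might look like the sketch below. The task `count_multiples` and the helper `run_parallel` are made-up names for illustration; each worker is a separate process with its own interpreter and its own GIL.

```python
from multiprocessing import Pool

def count_multiples(limit):
    """CPU-bound toy task: count the multiples of 3 below `limit`."""
    return sum(1 for n in range(limit) if n % 3 == 0)

def run_parallel(limits, workers=2):
    # Each worker is a separate process, so CPU-bound tasks can
    # run on several cores at once.
    with Pool(processes=workers) as pool:
        return pool.map(count_multiples, limits)
```

`run_parallel([10**6, 2 * 10**6])` should be called from under an `if __name__ == "__main__":` guard, since some platforms start workers by re-importing the main module.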
Jan 29 14:52:14 <tflink>  apparently it works on Mono (not sure who is working on this)
Jan 29 14:52:39 *  jdob (95a97df7@gateway/web/freenode/ip.149.169.125.247) has joined #fudcon-room-3
Jan 29 14:52:48 <tflink>  *q didn't microsoft fire the person who was working on IP (not sure if I'm right on this - transcriber)
Jan 29 14:52:55 <tflink>  ... moving on to PyPy
Jan 29 14:53:10 <tflink>  PyPy is very different from the others talked about
Jan 29 14:53:28 <tflink>  it is an implementation of an interpreter for the FULL python language (with JIT compilation)
Jan 29 14:53:37 <tflink>  it is written in a high level language
Jan 29 14:54:02 <tflink>  the implementation language is compiled down to .c code from which we get binary
Jan 29 14:54:11 <tflink>  can also compile to C#, Java etc.
Jan 29 14:54:29 <tflink>  PyPy is actually written in Python (hence PyPy)
Jan 29 14:54:46 <tflink>  * Diagram on slides about how many pythons
Jan 29 14:54:53 *  jdob (95a97df7@gateway/web/freenode/ip.149.169.125.247) has left #fudcon-room-3
Jan 29 14:55:21 <tflink>  you are supposed to run python code through the PyPy code, which spits out lots of generated c code
Jan 29 14:55:36 <tflink>  which should have the same behavior as the python code would have running through the interpreter
Jan 29 14:56:03 <tflink>  so you end up with c code that takes a while to translate but it can end up much faster than interpreted python code
Jan 29 14:56:13 <DiscordianUK>  PyPy has limitations
Jan 29 14:56:32 <tflink>  so we go through this strange process of translation to allow different optimization
Jan 29 14:56:55 <tflink>  pypy does have .pyc files by default (similar but different from CPython)
Jan 29 14:57:09 <tflink>  it is starting to have support for the CPython extension APIs
Jan 29 14:57:27 <tflink>  it is different, but every time the speaker has tried them, it has segfaulted, crashed and burned
Jan 29 14:57:39 <tflink>  bcl: sorry, I missed your question, will try after talk
Jan 29 14:57:47 <tflink>  advantages of pypy
Jan 29 14:57:54 <tflink>  speed: see http://speed.pypy.org
Jan 29 14:58:10 <tflink>  it is fast because the object implementations are better and have smarter data structures
Jan 29 14:58:26 <tflink>  JIT: a tracing JIT that traces the interpreter itself and compiles "hot" loops
Jan 29 14:59:02 <bcl>  tflink: I'm sitting behind you :) comment was about the multiprocessing question you missed.
Jan 29 14:59:17 <tflink>  bcl: thanks, just trying to keep up
Jan 29 14:59:45 <tflink>  memory usage should be better, because of smarter data structures
Jan 29 14:59:54 <tflink>  disadvantages of pypy:
Jan 29 15:00:06 <tflink>  currently at python 2.5 (2.7 is on its way)
Jan 29 15:00:16 <tflink>  6 million lines of autogenerated .c code
Jan 29 15:00:25 <tflink>  they also only seem to care about the 2 archs
Jan 29 15:00:34 *  badkittydaddy has quit (Ping timeout: 265 seconds)
Jan 29 15:00:48 <tflink>  also wanted to go into packaging and other implementation
Jan 29 15:01:00 <tflink>  but he will probably leave that discussion for the mailing list
Jan 29 15:01:10 <tflink>  * out of time
Jan 29 15:01:25 <tflink>  ==> This is the end of the python presentation