tag:blogger.com,1999:blog-63632463791122492612024-03-13T16:05:57.452-04:00It's Not Easy Being GenesAnonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.comBlogger33125tag:blogger.com,1999:blog-6363246379112249261.post-92165617130783290522014-01-31T10:46:00.000-05:002014-04-28T18:40:36.737-04:00So you're attending your first Python users group meeting / meetup<p>
<a href="http://www.flickr.com/photos/petyr/8151854073/in/set-72157631920844847/" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" title="Full House by Paul Collins, on Flickr"><img alt="Full House" src="http://farm8.staticflickr.com/7121/8151854073_05a1585d08_c.jpg" height="158" style="cursor: pointer; float: left; height: 158px; margin: 0pt 10pt 10px 0px; width: 240x;" width="240" /></a>Congratulations! You have decided to attend your local Python users group (PUG) meeting (or "meetup" in the <a href="http://www.meetup.com/">Meetup</a> parlance), putting yourself on a path for success in mastering Python, and programming in general. Here are a couple of tips to help you arrive prepared and ready to participate.</p>
<h3>Attend</h3>
<p>The most important thing to bring to a PUG meeting is yourself and your enthusiasm. If you're in a rush and don't have time to set up your laptop or anything else, <b>don't worry, just attend!</b> You will be welcome regardless of your technical and personal background. The global Python community <a href="http://www.python.org/community/diversity/">expects an atmosphere of respect and welcoming at each event</a>, especially for newcomers.
</p>
<h3>
Bring a laptop</h3>
<p>
Each PUG will arrange its meetings differently. <a href="http://www.meetup.com/dcpython/">DCPython</a>, for example, tended towards mostly presentations while I was a member. In contrast, <a href="http://www.meetup.com/OCPython/">OCPython</a> has had less formal meetings where there's more interaction and audience participation. <a href="http://en.wikipedia.org/wiki/Scout_Motto">Be prepared</a>. Bring a laptop just in case.</p>
<h3>
Prepare your laptop for Python</h3>
<p>Everything in this section should be considered optional but very helpful. Try to get your laptop set up before the meeting. <b>Don't fret if you can't or don't have time, though!</b> Wi-Fi access can't be guaranteed at a lot of venues, so it will be extremely helpful if you have all your software downloaded and, if possible, installed and configured <i>before</i> the meeting.</p>
<p>
If you get stuck with any of the tasks below, don't fret, and do attend anyway! You will find other attendees willing to troubleshoot and get you un-stuck, I promise. The Python community is among the friendliest and most eager to help.</p>
<h4>
Know how to open your terminal emulator</h4>
<p>
Most Python programs are invoked via the command line (typically by issuing a command that looks like "<span style="font-family: Courier New, Courier, monospace;">python <some_file.py></span>"). Learn how to access the <a href="http://en.wikipedia.org/wiki/Terminal_emulator">terminal emulator</a>/command line for your laptop's respective operating system, such as PowerShell for Windows, Terminal for OS X, or GNOME Terminal for Linux.</p>
<p>
Make it easy to get to the terminal emulator by putting a shortcut to it on your desktop, dock, or launcher. Oftentimes the presenter will assume you already have your terminal open and at the ready.</p>
<p>
Configure your terminal emulator to your liking ahead of time. For example, I find the default font size in OS X's Terminal too small and the colors too pastel to the point of being illegible. Finding the right places to change these settings to suit your style (and eyesight) can take some time, so get them fixed ahead of time. You won't want to be fiddling with these settings while the presenter's nuggets of advice go flying by.</p>
<p>
If you're unfamiliar with working on the command line, see <a href="http://cli.learncodethehardway.org/book/">Zed Shaw's Command Line Crash Course</a>.</p>
<a name="working-python-installation"><h4>
Have a working Python installation</h4></a>
<p>
If you have a Mac or a Linux laptop, the good news is you already have Python installed; proceed onward.</p>
<p>
If you're a Windows user, you have a little work ahead of you. First, <a href="http://www.python.org/download/">download a Python installer from the Python website</a>. Just grab the latest version (Python 3.4 at time of publication), run the installer (double click the installer file), and go through the dialogs (you can trust the defaults).</p>
<p>
The final step is to put Python on the <span style="font-family: Courier New, Courier, monospace;">PATH</span> so that you can run Python from your command line. <a href="http://stackoverflow.com/a/4621277/38140">Here is a howto for Windows 7 users</a>. (Note: instead of using "<span style="font-family: Courier New, Courier, monospace;">Python27</span>" use a number matching your download, for example "<span style="font-family: Courier New, Courier, monospace;">Python34</span>" if you downloaded and installed Python 3.4.)</p>
<p>
You can verify you put Python on your path by opening PowerShell, typing in "<span style="font-family: Courier New, Courier, monospace;">python</span>", and pressing the <span style="font-family: Courier New, Courier, monospace;">Enter</span> key. If you see a prompt that looks like "<span style="font-family: Courier New, Courier, monospace;">>>></span>" you're good to go. If instead you get an error that "python" isn't found, try going back through the steps of setting your path. If you get stuck here, ask for assistance at your meeting, show your helper this blog post, and they should be able to get you back on track.</p>
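<p>Once you can start Python at all (from the command line or from the IDLE program that the python.org installer sets up), a few lines of Python can report what you have. This is only a rough sketch: the function names <span style="font-family: Courier New, Courier, monospace;">install_dir_name</span> and <span style="font-family: Courier New, Courier, monospace;">find_on_path</span> are made up for illustration, it assumes Python 3.3 or newer (for <span style="font-family: Courier New, Courier, monospace;">shutil.which</span>), and the "Python34"-style folder name only reflects the naming used in this post, which other versions or installers may not follow.</p>

```python
import shutil
import sys


def install_dir_name(version_info=sys.version_info):
    """Return a folder name in the style used in this post, e.g. 'Python34'.

    Assumption: the installer used the PythonXY naming described above.
    """
    return "Python{}{}".format(version_info[0], version_info[1])


def find_on_path(command="python"):
    """Return the full path of `command` if it is on PATH, otherwise None."""
    return shutil.which(command)


if __name__ == "__main__":
    print("Running Python", sys.version.split()[0])
    print("Folder name to look for:", install_dir_name())
    print("'python' resolves to:", find_on_path("python"))
```

<p>If the last line prints <span style="font-family: Courier New, Courier, monospace;">None</span>, the <span style="font-family: Courier New, Courier, monospace;">python</span> command is probably not on your path yet, which is a good thing to mention to your helper at the meeting.</p>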
<h4>
Have a text editor or IDE installed and configured</h4>
<p>
Python programs are plain text files with .py extensions. While you could use Microsoft Word (heaven forbid) to open and edit the files, you should use a specialized tool for editing code. If you're new to programming, you'll probably do fine starting with a good <a href="http://en.wikipedia.org/wiki/Text_editor">text editor</a> that provides syntax highlighting and help with formatting your code correctly. If you're on Linux, you probably already have a decent editor like <a href="https://wiki.gnome.org/Apps/Gedit">Gedit</a> installed. If you're on Mac OS X try something like <a href="http://www.sublimetext.com/">Sublime Text</a>. If you're on Windows, try <a href="http://notepad-plus-plus.org/">Notepad++</a>.</p>
<p>
If you have some experience in programming in other languages or have the interest in investing more time upfront, you can install and get started using an <a href="http://en.wikipedia.org/wiki/Integrated_development_environment">integrated development environment (IDE)</a>. If you don't yet have a favorite IDE for Python development, try <a href="http://www.jetbrains.com/pycharm/">PyCharm</a> or <a href="http://ninja-ide.org/">Ninja IDE</a>. IDEs provide not just a text editor but also a debugger, a test runner, an interactive console, and some form of project management, among other tools. Again, if you're a newbie, just stick to getting a good text editor set up before attending the meeting.</p>
<h4>
Have pip installed</h4>
<p>
<a href="http://www.pip-installer.org/">pip</a> allows you to easily download and install other Python software libraries (called "Python packages") from the Internet. All you have to do is open your terminal, type in "<span style="font-family: Courier New, Courier, monospace;">pip install <PACKAGE_NAME></span>", and press <span style="font-family: Courier New, Courier, monospace;">Enter</span>, and pip will fetch the package from the Internet and install it for you. pip can come in quite handy if you would like to try out any packages mentioned or shown in tutorials during the meeting.</p>
<p>
If you installed Python 3.4 or newer, you already have pip installed. Proceed onward!</p>
<p>
If you have Python 3.3 or older (including Python 2.7), you will not have pip installed by default (unless you used an alternative Python distribution like <a href="https://store.continuum.io/cshop/anaconda/">Anaconda</a>). You can check whether or not you have pip installed by typing in "<span style="font-family: Courier New, Courier, monospace;">pip</span>" on your command line and hitting the <span style="font-family: Courier New, Courier, monospace;">Enter</span> key. If you get an error that the pip command cannot be found, you will need to install pip.</p>
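<p>Another way to check, from inside Python itself, is to see whether the <span style="font-family: Courier New, Courier, monospace;">pip</span> package can be imported. The sketch below is a minimal illustration, not an official pip recipe; the function name <span style="font-family: Courier New, Courier, monospace;">has_pip</span> is hypothetical, and it only tells you about the particular Python interpreter you run it with.</p>

```python
def has_pip():
    """Return True if the pip package can be imported by this interpreter."""
    try:
        import pip  # imported only to see whether it exists
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if has_pip():
        print("pip is available to this Python.")
    else:
        print("pip was not found; see the installation steps in this post.")
```

<p>Note that this can say pip is available even when the <span style="font-family: Courier New, Courier, monospace;">pip</span> command itself is not on your path; in that case, running <span style="font-family: Courier New, Courier, monospace;">python -m pip --version</span> from the command line should still work.</p>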
<p>
<a href="http://www.pip-installer.org/en/latest/installing.html">The pip documentation describes how to install pip</a>. The <a href="http://www.pip-installer.org/en/latest/installing.html#install-pip">easiest method uses the <span style="font-family: Courier New, Courier, monospace;">get-pip.py</span> script</a>. If this fails for you, you may need to try one of the alternative install options.</p>
<p>
This may be a more difficult step for Windows users. First, make sure that Python is on your <span style="font-family: Courier New, Courier, monospace;">PATH</span> (see <a href="#working-python-installation">"Have a working Python installation"</a> above). Next, download the get-pip.py script by right-clicking the link, selecting "Save As", and designating a location to save it, such as your "Downloads" directory. Next, open PowerShell and use the <span style="font-family: Courier New, Courier, monospace;">cd</span> command to move to your download directory (e.g., "<span style="font-family: Courier New, Courier, monospace;">cd $HOME\Downloads</span>"). Finally, in PowerShell, run <span style="font-family: Courier New, Courier, monospace;">python get-pip.py</span>.</p>
<p>
After the installation of pip completes, you should be able to access pip from the command line by typing in <span style="font-family: Courier New, Courier, monospace;">pip</span> and hitting Enter. Again, if you get stuck, show this blog post to someone at the meeting and he or she should be able to help you.</p>
<h3>
Bring some extra equipment (optional)</h3>
<p>
In an ideal PUG event, organizers will have time and resources to set up the event area with ample access to power, Wi-Fi, and the like. If your local PUG is new or just getting started, though, the organizers may have a hard enough time just finding a location to meet, let alone have time to prep the area for lots of tech. Here's some equipment that you can bring along that could help everybody have a better experience.</p>
<ul>
<li>A wireless hotspot device</li>
<li>Display dongles, adapters, and cables</li>
<li>Power strips</li>
<li>Business cards</li>
</ul>
<h3>
Have a good time; let others have a good time</h3>
<p>
PUG meetings are social events, not formal meetings. Some may run on tight schedules because they pack in a lot of content; however, don't confuse that structure with formality. Everyone attends PUG meetings first and foremost to share in the joy of the Python programming language. Unless you're presenting (and usually even if you are), you don't need to dress up; just come in something comfortable and inoffensive.</p>
<p>Do conduct yourself appropriately at the meeting, and err on the side of professionalism. The <a href="https://us.pycon.org/2014/about/code-of-conduct/">PyCon Code of Conduct</a> provides a good set of guidelines. If another attendee behaves inappropriately towards you or makes you feel uncomfortable or unwelcome, do feel free to confront the attendee on her or his behavior directly or otherwise raise the issue with the PUG's organizers immediately.</p>
<p>
On behalf of your local PUG, we look forward to seeing you soon!</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com0tag:blogger.com,1999:blog-6363246379112249261.post-4988344347695064592012-12-06T15:50:00.000-05:002012-12-06T23:00:26.985-05:00Putting syntax-highlighted code into presentation slides or documents<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/rinoshea/7702900836/" title="Brogramming with Tom by ryanoshea, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 240px; height: 240px;" src="http://farm9.staticflickr.com/8022/7702900836_0106e4e49a_m.jpg" width="240" height="240" alt="Brogramming with Tom"></a>
<p>Want to include syntax-highlighted code in your presentation? A project called <a href="http://pygments.org/">Pygments</a> provides a very helpful tool for this. First, you need to install Pygments:
<ul>
<li>If you're on OS X and use MacPorts, you can fetch it with
<pre>sudo port install py27-pygments</pre>
</li>
<li>If you're on Ubuntu/Debian, you can get it with apt-get with
<pre>sudo apt-get install python-pygments</pre>
</li>
<li>Or you can fetch it on any platform using <a href="http://www.pip-installer.org/">pip</a><a href="#fn1">*</a> with
<pre>sudo pip install pygments</pre>
</li>
</ul>
</p>
<p>Installing Pygments will also install a command line utility called <tt>pygmentize</tt><a href="#fn2">**</a>. We can use this tool to help us format code for use in a presentation or document with the following steps:
<ol>
<li>Open the terminal and do
<pre>pygmentize -f rtf <PATH_TO_CODE_FILE> | pbcopy</pre>
if you're on OS X, replacing <tt><PATH_TO_CODE_FILE></tt> with the actual path to your file of interest (use <tt>xsel -b</tt> instead of <tt>pbcopy</tt> if you're on Linux, or <a href="http://www.labnol.org/software/tutorials/copy-dos-command-line-output-clipboard-clip-exe/2506/"><tt>clip</tt></a> if you're on Windows). This will copy a colorized markup of your code to your clipboard.
</li>
<li>Paste the contents of the clipboard to your document or slide
<ul>
<li>This should be as simple as using <b>Edit → Paste</b> (<tt>COMMAND + V</tt> on OS X, <tt>CTRL + V</tt> on anything else)</li>
<li>If you're using PowerPoint, you instead need to use <b>Edit → Paste Special</b> (<tt>CTRL + COMMAND + V</tt> shortcut on Mac Office) and select "Formatted Text (RTF)". Also, you may need to create a new text box first, as the default text box will unhelpfully try to insert bullet points for you. Alternatively, you can just remove the bullet points by highlighting all the code and clicking the bullet point button (sometimes having to do this multiple times...)</li>
</ul>
</li>
</ol>
</p>
<p>That's it! <tt>pygmentize</tt> can parse files in a wide range of programming languages, as well as output in many different formats; for example, if you use LaTeX/Beamer, you can get TeX output by using <tt>-f tex</tt>. You can look at the Pygments documentation on <a href="http://pygments.org/docs/lexers/">lexers</a> and <a href="http://pygments.org/docs/formatters/#formatter-classes">formatters</a> to see the full list of languages and output formats Pygments and <tt>pygmentize</tt> support.</p>
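<p>If you switch between operating systems a lot, a tiny Python helper can assemble the right command for you. Treat this as a rough sketch: the function names <tt>clipboard_command</tt> and <tt>pygmentize_pipeline</tt> are made up for this example, the clipboard tools are simply the ones named in the steps above (with <tt>xsel -b</tt> assumed for anything that isn't OS X or Windows), and the function only builds the command text without running it.</p>

```python
import sys


def clipboard_command(platform=sys.platform):
    """Pick the clipboard tool named in this post for a sys.platform value."""
    if platform == "darwin":  # OS X
        return "pbcopy"
    if platform.startswith("win"):  # Windows
        return "clip"
    return "xsel -b"  # assumed default for Linux, as in the steps above


def pygmentize_pipeline(path, output_format="rtf", platform=sys.platform):
    """Build the command line as a string; nothing is executed here.

    Note: a path containing spaces would still need quoting in your shell.
    """
    return "pygmentize -f {} {} | {}".format(
        output_format, path, clipboard_command(platform)
    )


if __name__ == "__main__":
    # 'slides_demo.py' is a placeholder file name for illustration.
    print(pygmentize_pipeline("slides_demo.py"))
```

<p>For example, <tt>pygmentize_pipeline("hello.py", platform="darwin")</tt> gives <tt>pygmentize -f rtf hello.py | pbcopy</tt>, which you can then paste into your terminal.</p>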
<p><a name="fn1">*</a> Don't have pip? <a href="http://www.pip-installer.org/en/latest/installing.html">Go get it</a>!
<br />
<a name="fn2">**</a> If you installed with MacPorts, you'll probably have to use <tt>pygmentize-2.7</tt> instead of <tt>pygmentize</tt>.
</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com0tag:blogger.com,1999:blog-6363246379112249261.post-69793218736662916062011-09-10T15:41:00.006-04:002011-09-11T11:16:04.094-04:00In remembrance of September 11, 2001<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/jcolman/441030585/" title="American Flag picture - photo of the American Flag by jcolman, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 240x; height: 180px;" src="http://farm1.static.flickr.com/171/441030585_84546b0a5c_m.jpg" width="240" height="180" alt="American Flag picture - photo of the American Flag"></a><p>Today, I set aside some time from preparing for my defense to read through <a href="http://www.nytimes.com/interactive/us/sept-11-reckoning/viewer.html">the New York Times' tribute to September 11, 2001</a>, and reflect upon what that day and the events which have followed mean to me. I did not expect to feel so profoundly moved as I read through the stories, and in particular, I could not help but feel struck afresh with anguish and cry as I carefully paged through the <a href="http://www.nytimes.com/interactive/2011/09/08/us/sept-11-reckoning/towers.html">moving slideshow of the rise and fall of the towers of the World Trade Center</a>.</p>
<p>Still, <a href="http://www.nytimes.com/2011/09/08/us/sept-11-reckoning/dwyer.html">other articles</a> reminded me of my core belief in our country — in the people of our country. In spite of the willful erosion of <a href="http://www.nytimes.com/2010/11/19/business/19security.html">personal privacy</a> and <a href="http://www.nytimes.com/2011/09/07/us/sept-11-reckoning/civil.html">civil liberties</a> and <a href="http://query.nytimes.com/gst/fullpage.html?res=9805E5D71330F933A15751C1A96F9C8B63">civil tongues</a>, in spite of the tragic sacrifice of human lives both <a href="http://www.cnn.com/SPECIALS/war.casualties/">domestic</a> and <a href="http://www.iraqbodycount.org/">foreign</a>, in spite of ongoing <a href="http://www.npr.org/2011/08/23/139852035/shrimp-on-a-treadmill-the-politics-of-silly-studies">anti-intellectualism</a>, in spite of continuing sexual, religious, and racial intolerance, in spite of a bitterly polarized political climate, in spite of our <a href="http://en.wikipedia.org/wiki/Deepwater_Horizon_oil_spill">continued mismanagement of our environment</a> — in spite of all this, I still believe that the story of the United States of America is one of hope. If ever there were a country to break pre-conceived notions, to defy intolerance, to unite for a greater good, to show that change can be for the better, to overcome adversity, then it must be ours.</p>
<p>Ten years ago, I stood with friends in an undergraduate dorm room and watched the World Trade Center towers collapse and the Pentagon smolder. Now, here I stand to defend my Ph.D., and I can not help but feel grateful for all the opportunities I've had thanks to having a life here in the USA. I am not always proud of our country's actions, but I am proud of what our country stands for: truth, liberty, and justice for all. Our story is marked by tragedy and marred by missteps, but it is, indeed, the story of hope. I will always remember.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com2tag:blogger.com,1999:blog-6363246379112249261.post-82986288907052461552011-08-09T01:25:00.008-04:002014-04-02T02:49:23.198-04:00The bog of eternal singlehood: college towns beyond college<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/kandypics/223038889/" title="Life in a college town by rejohnson71, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 240px; height: 160x;" src="http://farm1.static.flickr.com/64/223038889_f92d2d226f_m.jpg" width="240" height="160" alt="Life in a college town"></a>
<p>As I apply for jobs, many of which, for better or worse, are at academic institutions, I keep having a nagging feeling tugging at the back of my mind, like the tantrum-throwing three-year-old desperate for that Yo Gabba Gabba doll tears at her parent's arm in aisle 14 of the local Target. This pressing thought which brings me so much strife: I'm just not sure if I can take living in yet another land-grant college town.</p>
<p>Don't get me wrong—there are many great things about living in a college town. Life is generally quite pleasant and quiet, save football weekends. The cost of living is usually fantastic. They also tend to be family friendly, with quaint little farmers' markets and little local restaurants and shops. They also tend to be fairly progressive and open-minded, and support culture and art to a greater extent than you'd expect from such a small population.</p>
<p>Yes, for many people, a college town is a rather idyllic place. There is a specific subpopulation in these college towns, however, for whom the experience becomes utterly hopeless. This subpopulation: those who move to college towns, are not college-aged, and arrive without a significant other. Meet those requirements, and you're basically hosed until you escape. It is the bog of eternal singlehood.</p>
<p>I mean, let's take an honest look at the candidates in the dating pool in a college town for those who already hold one or more higher education degrees:</p>
<ul>
<li><span style="font-weight:bold;">College kids</span>: I'm sorry, did you not see the word "kids" there?</li>
<li><span style="font-weight:bold;">Grad students</span>: Emotionally unstable semi-adults who incorrectly concluded that the panacea to their life problems was to get yet another degree.</li>
<li><span style="font-weight:bold;">Postdocs</span>: Does the sound of frantic typing as they try to finish their latest lit review during the act of love-making turn you on?</li>
<li><span style="font-weight:bold;">Junior faculty</span>: Ah, the less youthful, less healthy, more stressed versions of postdocs. Yes, I'm sure you had a good reason behind that choice...</li>
<li><span style="font-weight:bold;">Staff</span>: They probably arrived there because of a significant other; if they are single at this point, they're looking for an opportunity to flee, not to stay.</li>
<li><span style="font-weight:bold;">Hipster/Hippie Townies</span>: It's okay, so long as their friends never find out they're sleeping with you. Oh wait, it's a small college town...</li>
<li><span style="font-weight:bold;">Folk in the surrounding countryside</span>: Don't be surprised if you're viewed as an over-educated, heathen, pinko socialist who never learned how to do anything actually useful (all of which could be accurate assessments).</li>
<li><span style="font-weight:bold;">People in the nearest city... five hours away</span>: They're already pairing up with equally smart, young, attractive, better-paid competition that had the foresight to not force the issue of a long-distance relationship on the first date.</li>
</ul>
<p>As a consolation, you will find great friends, for whom your sad, lonely, single self will serve as a reminder of why they need to stay committed to their own relationships.</p>
<p>With complete seriousness, I've found a tremendous amount of personal growth in the college towns I've inhabited for the past twelve years, and certainly, the quality of friends I've found in them has been unsurpassed. I admit that location is really only one part of the whole romantic equation.</p>
<p>Anyway, we'll see what the future brings. Maybe I'll finally join the young guns in a big ol' city, myself. Or maybe I'll find the one who breaks the mold. Or maybe it'll just be the status quo, but hey, there are <a href="http://www.youtube.com/watch?v=2TY8T9iTUxc">far worse bogs out there</a>!</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com0tag:blogger.com,1999:blog-6363246379112249261.post-14701537139466897562011-04-29T18:23:00.008-04:002011-06-25T19:58:02.876-04:00Let's talk: designing inter-cellular circuits through synthetic biology<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/cizake/4164756091/" title="True phone by Florian SEROUSSI, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 240x; height: 119px;" src="http://farm3.static.flickr.com/2497/4164756091_80f19ce3e2_m.jpg" alt="True phone" height="119" width="240" /></a>
<p>Thursday, <a href="http://www.gbcb.org.vt.edu/">GenBioOrg</a> brought in <a href="http://groups.csail.mit.edu/synbio/users/rweiss/">Prof. Ron Weiss</a> to speak about his work in designing biological circuits, and this crazy, sometimes hyped field of synthetic biology. Prior to Prof. Weiss's talk, while I could appreciate the idea of synthetic biology, I mostly regarded it as a somewhat foolish pursuit, on account of the amount of fundamental biology we still just do not know. In many ways, my field of computational systems biology relies on building and testing models from "Swiss cheese knowledge", where gaps prevail (e.g., protein-protein interaction networks built from <a href="http://en.wikipedia.org/wiki/Two-hybrid_screening">yeast-two-hybrid</a> studies with high false-positive rates, or <a href="http://en.wikipedia.org/wiki/DNA_microarray">microarray</a> analysis suffering from the high-dimensionality, low-sample conundrum). Thus, whatever decries the prematurity of systems biology goes doubly so for synthetic biology, for which systems biology provides a central strut. The rationale is, if you don't understand it, how can you manipulate it? Of course, as I've learned repeatedly (but have failed to generalize), "You don't need to understand the internal combustion engine to drive a car."</p>
<p>Well, Thursday afternoon, Prof. Weiss deftly reminded me of this reality through his combination of humility-tempered optimism, his impressive collection of proofs-of-concept, and his insight for possible applications. He presented a number of intriguing biological circuits in his talk, but I felt most excited by his work on <a href="http://www.nature.com/nature/journal/v434/n7037/full/nature03461.html">pattern formation through synthetic inter-cellular signaling networks</a> (behind a Nature paywall, sorry). In this work, Weiss and his colleagues created a population of "receiver" bacteria cells, which had a genetic circuit that would cause cells to fluoresce (light up green) at a moderate concentration of a molecule called acyl-homoserine lactone (AHL).</p>
<p>To make the receivers fluoresce within a specific concentration of AHL, Weiss and colleagues actually made the receiver cells fluoresce (i.e. "be on") by default. They then created two AHL-detection circuits with very different input thresholds: a high-detection circuit which activates in the presence of large amounts of AHL, and a low-detection circuit which activates in low amounts or in the absence of AHL. Weiss and colleagues wired both detectors to the same output: when activated, they repressed ("turned off") fluorescence. If you're familiar with electronics, you'll see that Weiss and colleagues constructed a <a href="http://en.wikipedia.org/wiki/NOR_gate">NOR gate</a>, where the inputs are "high AHL" and "low/no AHL". If you're a programmer, you might think of the condition for fluorescence as
<pre class="brush: python"># Glow only when neither detector fires (a NOR of "high AHL" and "low/no AHL")
if not (ahl_level > high_threshold) and not (ahl_level < low_threshold):
    cells.fluoresce()
</pre>
</p>
<p>Weiss and colleagues then developed "sender" cells containing a circuit that caused synthesis and secretion of AHL when exposed to tetracycline. When a colony of sender cells was placed in the middle of a "lawn" of receiver cells and exposed to tetracycline, the sender cells emitted AHL, which then diffused as a radial gradient from the colony, resulting in a concentric ring of fluorescence around the sender colony, but not immediately touching it, like a <a href="http://en.wikipedia.org/wiki/Bullseye_%28target%29">bullseye</a>. That is, right by the sender colony, the AHL level was highest, so the high-detection circuit turned on, shut off fluorescence, and left those cells dark. A little further out, the level of AHL that diffused from the senders was more moderate, so both the high-detection and the low-detection circuits remained off, allowing those cells to fluoresce. Beyond those cells, the level of AHL was too low, and though the high-detection circuit remained off, the low-detection circuit turned on and repressed the fluorescence again.</p>
<div style="text-align:center;font-size:0.9em">
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/-V3DW1hCQZ2E/Tb7RuwJlunI/AAAAAAAAA_Y/rY6caCTASzE/s1600/weiss_ellipse.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 378px;" src="http://1.bp.blogspot.com/-V3DW1hCQZ2E/Tb7RuwJlunI/AAAAAAAAA_Y/rY6caCTASzE/s400/weiss_ellipse.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5602145587624786546" /></a>
Colony of sender cells, fluorescing red, placed in a lawn of receiver cells. The sender colonies secrete signaling molecule AHL, which diffuses through the media. Receiver cells a sufficient distance from the colonies will receive enough AHL to fluoresce green, while those too near or too far will receive too much or too little AHL, respectively, remaining dark. [Image obtained from Ron Weiss with permission, modified by CDL to include labels.]</div>
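<p>For the programmers in the audience, the bullseye can be mimicked with a toy simulation. To be clear about what is assumed here: the exponential falloff of AHL with distance, the threshold values, and all of the function names are invented for illustration and are not taken from the paper. The only part that mirrors the description above is the logic, where a cell glows only if neither the high detector nor the low detector fires.</p>

```python
import math


def ahl_level(distance, source_level=100.0, decay_length=2.0):
    """Toy AHL concentration: exponential falloff with distance (an assumption)."""
    return source_level * math.exp(-distance / decay_length)


def fluoresces(level, low_threshold=5.0, high_threshold=40.0):
    """NOR of the two detectors: glow only if neither detector is active."""
    high_detector_on = level > high_threshold
    low_detector_on = level < low_threshold
    return not (high_detector_on or low_detector_on)


def ring_profile(max_distance=10):
    """One character per unit of distance from the senders: 'G' glows, '.' is dark."""
    return "".join(
        "G" if fluoresces(ahl_level(d)) else "."
        for d in range(max_distance + 1)
    )


if __name__ == "__main__":
    print(ring_profile())
```

<p>With these made-up numbers, <code>ring_profile()</code> returns <code>..GGGG.....</code>: dark next to the senders, a glowing band at moderate distances, and dark again further out.</p>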
<p>While this makes for pretty pictures, taxpayers rest assured: glowing cells are only the proof of concept. This research has major implications for practical applications, for example, in stem cell research, tissue engineering, and bioengineering.</p>
<p>As a high school student, I felt incredibly excited to learn the answer to the question, "How can <a href="http://en.wikipedia.org/wiki/Blastula">a ball of indistinguishable cells</a> turn into a brain, limbs, skin, etc.?" The answer, as those of you with some developmental biology background know, is "Through protein gradients," and more specifically through transcription factors and their co-activators and co-repressors. Beginning with your mother's egg cell, there already existed protein gradients which pre-determined the regions that formed your head, or your feet, or your inner organs, and as your zygotic cells divided, these protein gradients begot even more protein gradients, in a beautiful choreography perfected through billions of years of evolution. This research by Prof. Weiss and his colleagues demonstrates that synthetic biology may provide not only a means of guiding stem cells (either derived from an embryo or returned to their embryo-like stage) through the difficult process of differentiating into other cell types when cued by specific protein concentrations, but also a means of creating colonies of cells capable of producing protein gradients. Through a successful combination of these sender-receiver circuits, we could achieve multiple types of differentiated cells, and maybe even self-organizing tissues, all from the same culture of stem cells.</p>
<p>Likewise, this research has important implications in mixed cell cultures. For example, the liver is primarily composed of cells called hepatocytes, which perform most of the functions of the liver, such as detoxification, lipid homeostasis, and blood plasma production. However, by <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0015456">culturing hepatocytes together with another cell type found in the liver, called liver sinusoidal endothelial cells (LSECs)</a>, the hepatocytes maintain their "liver-ness" far better than when cultured alone. Weiss's research implies that we may some day be able to develop synthetic "surrogate" cells to support cells that are characteristically difficult to maintain <span style="font-style: italic;">ex vivo</span> by providing important intercellular signals.</p>
<p>In terms of bioengineering applications, such as biodiesel or pharmaceutical production, a major stumbling block has been the difficulty in engineering biological systems with the biochemical capacities needed to carry out each step required to manufacture a complex molecule. Weiss's research suggests growing practicality in molecule manufacturing by designing chains of biological pathways that exist in separate organisms, much as is the case for <a href="http://en.wikipedia.org/wiki/Hydrothermal_vent">deep-sea vent ecosystems</a>.</p>
<p>Two other profound discoveries that Weiss presented were completely counterintuitive to me: 1) adding complexity to a biological circuit tends to bring about more digital (on/off) behavior rather than analog (continuous gradient from low to high) behavior, and 2) coupling components tends to reduce noisiness in the circuit rather than increase it. Although I do not have time to recapitulate Prof. Weiss's demonstrations of these emergent behaviors, I encourage you to <a href="http://www.ncbi.nlm.nih.gov/pubmed?term=weiss%20ron">browse through his publications</a> yourself.</p>
<p>The last two points I'd like to note from Prof. Weiss's talk are the following quips, which I found particularly encouraging (paraphrasing). First:
<blockquote>Computational simulation is absolutely central to synthetic biology. We are beyond the point where we can design biological circuits through intuition alone. —Prof. Ron Weiss</blockquote>
This statement makes me feel validated for pursuing a background in computational biology. Second:
<blockquote>We've been working on a project for eight years now that we still haven't published results from. We're very close, though. It will be just another year or so. At least, that's what I tell my graduate student. And the graduate student that takes the project after she graduates. And the one after she graduates. —Prof. Ron Weiss</blockquote>
Researchers with careers as illustrious as Prof. Weiss's can come in and dazzle us grad students with tales of field-changing success, and I think this gives unreasonable and unwarranted expectations of how our own research paths should go. Certainly in my case I've felt that because I've struggled, I must not be successful, because it rarely seems the successful people struggled. It's refreshing to see an admirable figure in his field open up and show vulnerability by admitting that, even to this day, he has his struggles.</p>
<p>To summarize, here are my takeaway thoughts from Prof. Weiss's talk:
<ul>
<li>The time for synthetic biology research is now.</li>
<li>Researchers can engineer cell-cell communication, beginning the era of human-designed mixed cell cultures.</li>
<li>Even excellent researchers struggle.
</li>
</ul>
</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com0tag:blogger.com,1999:blog-6363246379112249261.post-71592623840166088012011-03-28T10:00:00.007-04:002012-06-06T08:58:16.451-04:00Driven by the pursuit of proficiency<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/beatkueng/2699418104/" title="velocity by 'PixelPlacebo', on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 240px; height: 146px;" src="http://farm4.static.flickr.com/3295/2699418104_fc844dbf25_m.jpg" width="240" height="146" alt="velocity" /></a>
<p>I am looking for a job, and as this is only the third time in my life job hunting, I have sought advice anywhere I can get it. Like most universities, Virginia Tech has a <a href="http://www.career.vt.edu/">Career Services office</a>, so I consulted their website to help get things rolling. They suggest beginning with a self-assessment, and the very first item of this self assessment bluntly asks, "What do you want to achieve in your work?" While this question is frustratingly broad, it is fair game; one could expect such a question in an interview, and one certainly must have an answer ready.</p>
<p>I have never felt guided by some vision of how my life should be. I mean, sure, when I was 8, I wanted to fly an <a href="http://en.wikipedia.org/wiki/Top_Gun">F-14 Tomcat and shoot down commie MiGs</a> because that made you a hero, and when I was 13, I wanted to be a <a href="http://en.wikipedia.org/wiki/The_Life_Aquatic_with_Steve_Zissou">Marine Biologist</a> because of National Geographic and NOVA PBS shows, and when I was 18 I wanted to be a physician because that's what all biology majors intend to be. Each one of those was a fantasy: a spurious projection of the possible me, based on the immature and incomplete value system I held at the moment. When I gave up on the gauntlet of medical school my junior year at <a href="http://www.uga.edu/">UGA</a>, I also gave up pretending I could calculate long term career trajectories. Despite blowing off this central tenet of many a cookie-cutter career book and commencement address, I've thus far avoided becoming a complete and utter catastrophe of a human being, so I continue to make do without.</p>
<p>Now I'm a grad student in computational biology—the result of a few simple ingredients: 1) I've enjoyed computer programming since high school, and 2) I've found biology fascinating since the days of reading <a href="http://www.zoobooks.com/">Zoobooks</a> at the dinner table. I have also loved video games since grade school, but I didn't take the career path of a video game programmer because I didn't feel like I would make a substantial contribution to humankind. In contrast, I quit trying to become a physician—a career in which I would have had a direct and tangible impact on other people's lives—because I saw the competition was better than I was at jumping through the med school application hoops. (I also love playing guitar, but let's be realistic—although <a href="https://twitter.com/#!/gotgenes/statuses/27556652228026368">I've re-evaluated that option</a> and there are worse things.)</p>
<p>According to <a href="http://www.hulu.com/watch/4183/saturday-night-live-down-by-the-river">motivational speaker</a> <a href="http://www.danpink.com/">Dan Pink</a>, motivation boils down to three needs: <a href="http://www.youtube.com/watch?v=u6XAPnuFjJc">mastery, autonomy, and purpose</a> (video below if you are unfamiliar with Pink's theory). From that standpoint, I didn't feel a sense of mastery in my quest to become a physician, and I didn't feel a sense of purpose in my pursuit of game programming, but with a decent grasp of biology and a propensity for programming, computational biology seemed a good fit. Now the question stands, has it been?</p>
<p>From the standpoint of fulfilling the need for purposeful work, I have to say I have certainly experienced a boost in motivation after <a href="http://igotgenes.blogspot.com/2008/12/today-i-quit.html">switching research groups</a>, due in large part to shifting the biological subject from bacteria to <span style="font-style:italic;">in vitro</span> liver tissue culture systems, which has more immediate implications for human health—a subject which still motivates me.</p>
<p>Considering proficiency, though, I feel very uncertain about my path in research. My RSS feeds continue to fill up with tables of contents from journals faster than I can screen them for interesting abstracts. Also, although I don't dread <a href="http://igotgenes.blogspot.com/2008/10/saying-when-to-literature.html">reading through literature in the field as much as before</a>, I still just dislike doing it. I think this indicates a major obstacle to a career in research because it breaks the virtuous cycle of positive feedback: what we like, we do more of, so we get better at it, which makes us like it more and do more of it, which makes us better at it, and so on.</p>
<p>It's not clear I've grown much as a presenter, either, though it's not for lack of opportunities. I've given at least one presentation a month, sometimes several, mostly to my two research groups, but with some conference and departmental talks, as well. While I've gotten better at recognizing the work pattern that goes into preparing a presentation, I don't feel I've been able to reduce the time it takes to prepare them, and while I feel I've improved in delivery technique, I feel disappointed at how little I've improved given the amount of time I've invested. This said, I have discovered I enjoy delivering a presentation for which I've prepared adequately, which I attribute to the performance aspect.</p>
<p>If we take a look at the most important currency in academia, publications, I'm far from flush, with one co-authorship on <a href="http://www.springer.com/computer/bioinformatics/book/978-0-387-09759-6">a book chapter</a>, one second-authorship on <a href="http://www.liebertonline.com/doi/abs/10.1089/ten.TEC.2010.0012">a collaboration paper</a>, and one first-authorship on an <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0015247">original research article</a> (Open Access, yay!). I'm working on another paper currently, and should begin another one prior to defending in June. It's not a sparse record, but it's unremarkable. If I have learned anything from <a href="http://sethgodin.typepad.com/the_dip/">The Dip</a>, it's that I want to do remarkable work.</p>
<p>I really want to become proficient, but after nine years of working in academic research from undergrad, to research tech, to grad student, I feel it's escaping me in this pursuit. My research experience feels like long periods of slogging, largely devoid of any feedback, let alone positive feedback (which is rare and fleeting). I want a research experience that breaks that mold, but I'm willing to accept I might not find one, and I'm becoming more enthusiastic about switching tracks to a career where I can make a genuine success of myself. I want that virtuous cycle of positive feedback. I want to <a href="http://www.flickr.com/photos/blackbeltjones/3365682994/">get excited and make things</a>!</p>
<p>So, to the future interviewer who asks, "What do you want to achieve in your work?" I answer this: "I want to achieve remarkable proficiency." Why settle for less? Life is short; let's find a way to become awesome while we still can.</p>
<iframe title="YouTube video player" width="640" height="390" src="http://www.youtube.com/embed/u6XAPnuFjJc" frameborder="0" allowfullscreen></iframe>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com0tag:blogger.com,1999:blog-6363246379112249261.post-19349264957248712672011-01-07T13:03:00.003-05:002011-06-25T20:07:03.067-04:00Common Good: Adding a Creative Commons License button to your Blogspot (Blogger) blog<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/qthomasbower/3640362081/" title="2500 Creative Commons Licenses by qthomasbower, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 240x; height: 240px;" src="http://farm4.static.flickr.com/3396/3640362081_a27c43de6e_m.jpg" alt="2500 Creative Commons Licenses"/></a>
<p>I have intended to place the contents of this blog under a <a href="http://creativecommons.org/">Creative Commons (CC) license</a> for a long while, especially given that all the attractive photos I love to use in these blog entries come from <a href="http://www.flickr.com/creativecommons/">Creative Commons-licensed content on Flickr</a>. For those unfamiliar with <a href="http://creativecommons.org/licenses/">Creative Commons licenses</a>, they explicitly permit re-use of creative works <span style="font-style:italic;">a priori</span>. Provided you follow the criteria of the particular CC license of the work (usually simply attributing the original creator), you may simply use, or even modify the work, without the need to contact the original creator for direct permission to do so. Read the <a href="http://creativecommons.org/licenses/">Creative Commons' website</a> for more detail.</p>
<p>I had let this task linger far too long, so, spurred on by a recent email exchange with <a href="http://twitter.com/#%21/science3point0">Mark Hahnel</a> of <a href="http://www.science3point0.com/">Science 3.0</a>, I finally felt the inspiration to get this done. Unfortunately, I didn't find the top-ranked pages in Google searches for placing a CC license button on Blogger/Blogspot blogs very helpful, so I decided to just figure it out. It turned out to be a simple process, so I documented it and present it here, in step-by-step format (all under the CC-BY license, of course):</p>
<ol>
<li>Go to the <a href="http://creativecommons.org/choose/">Creative Commons website and choose a license</a>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_5SL_dsq8Oas/TSdIYCqzFQI/AAAAAAAAA9A/PHRI4EHIVeo/s1600/select-cc-license.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 358px; height: 400px;" src="http://1.bp.blogspot.com/_5SL_dsq8Oas/TSdIYCqzFQI/AAAAAAAAA9A/PHRI4EHIVeo/s400/select-cc-license.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5559491842882606338" /></a>
</li>
<li>Copy the HTML that CC presents you after you've selected your license
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdIEI6Cp1I/AAAAAAAAA80/C74OWop25Yw/s1600/cc-html.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 298px;" src="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdIEI6Cp1I/AAAAAAAAA80/C74OWop25Yw/s400/cc-html.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5559491500959770450" /></a>
</li>
<li>Go to your blog's page, and click the "Design" link in the navigation bar at the top.
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdIhDbq5vI/AAAAAAAAA9I/VO1v_8Q3AZQ/s1600/design-link.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 204px;" src="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdIhDbq5vI/AAAAAAAAA9I/VO1v_8Q3AZQ/s400/design-link.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5559491997706413810" /></a>
Alternatively, go to your <a href="http://www.blogger.com/home">Blogger author page</a> and click the appropriate "Design" link for your blog there.
</li>
<li>Click the "Add a Gadget" link in the design editor (there should be one at the bottom of the area).
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_5SL_dsq8Oas/TSdImLkQZNI/AAAAAAAAA9Q/gWJK9xffA-8/add-gadget-link.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 291px;" src="http://3.bp.blogspot.com/_5SL_dsq8Oas/TSdImLkQZNI/AAAAAAAAA9Q/gWJK9xffA-8/add-gadget-link.png" border="0" alt="" /></a>
</li>
<li>Click the link to add an "HTML/JavaScript" gadget.
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdIrfOL-mI/AAAAAAAAA-c/7zt3fc0kt3Y/choose-gadget.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 364px;" src="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdIrfOL-mI/AAAAAAAAA-c/7zt3fc0kt3Y/choose-gadget.png" border="0" alt="" /></a>
</li>
<li> Add a title, like "CC License", paste the HTML of your license button that you copied from the CC website into the contents box, and click "Save".
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_5SL_dsq8Oas/TSdIyfEssNI/AAAAAAAAA9g/D2mTjlVVReA/s1600/html-dialogue.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 387px;" src="http://3.bp.blogspot.com/_5SL_dsq8Oas/TSdIyfEssNI/AAAAAAAAA9g/D2mTjlVVReA/s400/html-dialogue.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5559492297184030930" /></a>
</li>
<li><span style="font-weight:bold;">Optional:</span> You'll be back at the design editor; click and drag the new CC License gadget to move it below your Blog Posts gadget (or some other fitting area).
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdI5TWJe-I/AAAAAAAAA9o/Dr_ceOjWmmA/s1600/click-drag-license.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 221px;" src="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdI5TWJe-I/AAAAAAAAA9o/Dr_ceOjWmmA/s400/click-drag-license.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5559492414295079906" /></a>
</li>
</ol>
<p>That's it! Your shiny new CC license button should appear where you placed it.</p>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdI_9UrT3I/AAAAAAAAA9w/h3_AjDdarw8/s1600/cc-button-added.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 337px;" src="http://4.bp.blogspot.com/_5SL_dsq8Oas/TSdI_9UrT3I/AAAAAAAAA9w/h3_AjDdarw8/s400/cc-button-added.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5559492528642412402" /></a>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com9tag:blogger.com,1999:blog-6363246379112249261.post-76582269805026372352010-08-17T17:36:00.009-04:002011-06-25T20:09:16.664-04:00Cell on Wheels: Famous Scientist Roller Derby Names<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/minette_layne/935380079/" title="Valtron 3000 by Minette Layne, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 240px; height: 160px;" src="http://farm2.static.flickr.com/1199/935380079_ffcab7845f_m.jpg" width="240" height="160" alt="Valtron 3000" /></a>
<p>This past Sunday I had an exciting first <a href="http://en.wikipedia.org/wiki/Roller_derby">Roller Derby</a> experience when I went out to support the <a href="http://www.facebook.com/pages/Christiansburg-VA/NRV-Roller-Girls/35743497161">NRV Rollergirls</a> in their bout against the <a href="http://masondixonrollervixens.com/">Mason Dixon Roller Vixens</a>. For those unfamiliar with modern roller derby, it is a contact, point-based sport in which the players of two teams skate in a circuit, trying to help their point-scorer, designated a "jammer", pass the other team, whilst simultaneously using any blunt part of their bodies above the knees to prevent the other team's jammer from passing them. The rest of the details you can pick up as you watch.</p>
<p>Today, roller derby is largely an all-women's sport, where men play supporting roles as coaches and referees, which is sort of an interesting role-reversal. My favorite part of the roller derby culture is that all participants don <span style="font-style:italic;"><a href="http://en.wikipedia.org/wiki/Pseudonym">noms de guerre</a></span>, which usually involve clever (or even tacky) wordplay, including the use of <a href="http://en.wikipedia.org/wiki/Homophone">homophones</a>, <a href="http://en.wikipedia.org/wiki/Oronym">oronyms</a>, and <a href="http://en.wikipedia.org/wiki/Portmanteau">portmanteaus</a>, to spin references to pop culture, history, or anything otherwise generally familiar, with a violent, aggressive bent. For example, my favorites for the NRV Rollergirls are <a href="http://en.wikipedia.org/wiki/Adventures_of_Huckleberry_Finn">Huck Finish Her</a> and <a href="http://en.wikipedia.org/wiki/Eleanor_Roosevelt">Eleanor Blows B. Dealt</a>, but other good examples include <a href="http://en.wikipedia.org/wiki/Baby_Ruth">Baby Ruthless</a> and <a href="http://en.wikipedia.org/wiki/Buddy_holly">Bloody Holly</a> from the enjoyable film <a href="http://www.imdb.com/title/tt1172233/">"Whip It"</a>, or <a href="http://www.nytimes.com/2007/11/17/nyregion/17about.html">Hyper Lynx, Auntie Christ</a>, <a href="http://www.nytimes.com/2009/02/01/magazine/01Derby-t.html">Beyonsláy, and Nina Millimeter</a>, who have appeared in various articles in the New York Times.</p>
<p>As I lay awake Monday, unable to sleep with anxieties about upcoming presentations, needing to develop an entirely different computational approach for research, and general insecurities about my place in life, I started thinking about how amusing derby names are, and then tried inventing some of my own. Then I had a revelation that it would be hilarious if there were derby names based off of (relatively) famous scientists. Once I got a few, I started jotting them down. Here's a list of ones I've come up with, so far:</p>
<ul> <li><a href="http://en.wikipedia.org/wiki/Stephen_Hawking">Stephen Knocking</a></li> <li><a href="http://en.wikipedia.org/wiki/J._Robert_Oppenheimer">J. Robert Uppyurheimer</a></li> <li><a href="http://en.wikipedia.org/wiki/James_watson">James Swat Some</a></li> <li><a href="http://en.wikipedia.org/wiki/Julius_Robert_von_Mayer">Julius Robert von Slayer</a></li> <li><a href="http://en.wikipedia.org/wiki/Louis_Pasteur">Louis Passed U</a></li> <li><a href="http://en.wikipedia.org/wiki/Archimedes">Archimelees</a></li> <li><a href="http://en.wikipedia.org/wiki/Aristotle">Aristhrottle</a></li> <li><a href="http://en.wikipedia.org/wiki/Niels_Bohr">Kneel Spore</a></li> <li><a href="http://en.wikipedia.org/wiki/Max_Planck">Mack's Plank</a></li> <li><a href="http://en.wikipedia.org/wiki/Carl_sagan">Carl Satan</a></li> <li><a href="http://en.wikipedia.org/wiki/Jonas_Salk">Jonas Shock 'n' Awe</a></li> <li><a href="http://en.wikipedia.org/wiki/Alexander_Graham_Bell">Alexander Grand Hell</a></li> <li><a href="http://en.wikipedia.org/wiki/William_Ramsay">William Rams Ye</a></li> <li><a href="http://en.wikipedia.org/wiki/Ernest_Rutherford">Earnest Rougher Foot</a></li> <li><a href="http://en.wikipedia.org/wiki/Paul_Dirac">Paul D. Rock</a></li> </ul>
<p>I would like to point out the obvious: these are all plays on men's names, which is ironic given that roller derby is played predominantly by women. This is disheartening for three reasons:</p>
<ol>
<li>I could only come up with four "famous" female scientists offhand: <a href="http://en.wikipedia.org/wiki/Marie_Curie">Marie Curie</a>, <a href="http://en.wikipedia.org/wiki/Jane_Goodall">Jane Goodall</a>, <a href="http://en.wikipedia.org/wiki/Rosalind_Franklin">Rosalind Franklin</a>, and <a href="http://en.wikipedia.org/wiki/Lynn_Margulis">Lynn Margulis</a>.</li> <li>I couldn't come up with a clever spin on any of them.</li> <li>Did I mention I could come up with only <span style="font-style:italic;">four</span> famous scientists who are women? This reflects poorly on me, but I think also on the inequality that exists in scientific education and scientific research, both of yesteryear but also today. This is another issue for another blog post.</li>
</ol>
<p>If you have any suggestions for scientists I've missed (particularly famous women who are or were scientists), or better suggestions for the ones I've attempted to spin, I encourage you to post them in the comments, or put them in your own blog and post the link below.</p>
<p><span style="font-weight:bold;">Update 2011-05-17</span>: Randall Munroe published a relevant comic on the final points on female scientists:
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://xkcd.com/896/"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 471px; height: 782px;" title="Marie Curie by Randall Munroe, on xkcd" src="http://imgs.xkcd.com/comics/marie_curie.png" border="0" alt="" /></a></p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com2tag:blogger.com,1999:blog-6363246379112249261.post-64358243119054046052010-01-21T13:32:00.009-05:002011-06-25T20:13:16.745-04:00Interactive sandboxes: using IPython with virtualenv<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/benmcleod/213005390/" title="sandbox baby by Ben McLeod, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 152px; height: 250px;" src="http://farm1.static.flickr.com/91/213005390_20bf80b61e.jpg" width="333" height="500" alt="sandbox baby" /></a>
<p>A very helpful <a href="http://blog.ufsoft.org/2009/1/29/ipython-and-virtualenv">blog post on IPython and virtualenv</a> by <a href="http://ufsoft.org/">Pedro Algarvio</a> inspired this one. The advice found there takes you 90% of the way to where you want to be. I'll recap that 90%, then explain and give the extra 10%. I am indebted to Pedro for doing all the hard work.</p>
<p>First of all, if you are unfamiliar with Ian Bicking's <a href="http://pypi.python.org/pypi/virtualenv">virtualenv package</a>, you should know two things about it:</p>
<ol>
<li>virtualenv allows you to develop in sane, aseptic, "sandbox" development environments, switch between them seamlessly, and maintain harmonious order in your Python universe.</li>
<li>virtualenv is certifiably awesome. Proceed directly to installing it (especially in combination with <a href="http://pip.openplans.org/">pip</a>)! <a href="http://www.google.com/search?q=%22do+not+pass+go%22">Do not pass Go! Do not collect $200!</a></li>
</ol>
<p>Arthur Koziel already wrote <a href="http://arthurkoziel.com/2008/10/22/working-virtualenv/">a really good tutorial on using virtualenv</a>, and, in fact, you'll probably find working with <a href="http://www.doughellmann.com/">Doug Hellmann</a>'s excellent <a href="http://www.doughellmann.com/projects/virtualenvwrapper/">virtualenvwrapper</a> more convenient; in this case, Doug already wrote <a href="http://www.doughellmann.com/articles/pythonmagazine/completely-different/2008-05-virtualenvwrapper/index.html">an excellent virtualenvwrapper tutorial</a>, too. I've mentioned <a href="http://ipython.scipy.org/">IPython</a> in <a href="http://igotgenes.blogspot.com/2009/01/tab-completion-and-history-in-python.html">a previous blog post</a>, so I won't cover that here, either. Instead, let's cut to the chase and get IPython and virtualenv playing well together.</p>
<p>Ordinarily, IPython, commonly installed system-wide by your preferred package management system, remains oblivious of an activated virtualenv environment, and will just mill about importing packages and modules from the system, rather than the sandbox. This gives two obvious solutions: either 1) configure the system installation of IPython to work with virtualenv, or 2) install IPython in each virtualenv environment. Doug Hellmann wrote a <a href="http://www.doughellmann.com/articles/pythonmagazine/completely-different/2008-02-ipython-and-virtualenv/index.html">nice tutorial</a> on taking the latter approach; here, we'll focus on the former, which I prefer, since it means having to install IPython only once.</p>
<p>IPython (being a Python program) can read and execute Python scripts during launch; we'll use this mechanism to modify IPython's launch to hook into the virtualenv environment we're currently in. First, we'll tell IPython that we want to execute some code at startup. If we go to the <span style="font-family: monospace">$HOME/.ipython/</span> directory, we'll find a file called <span style="font-family: monospace">ipy_user_conf.py</span>. Open the file in your editor of choice, locate the function <span style="font-family: monospace">main()</span>, and within that function (I suggest at the end), insert the following line:</p>
<pre class="brush: shell">
execf('~/.ipython/virtualenv.py')
</pre>
<p>Next, we need to create this file. Still in the <span style="font-family: monospace">$HOME/.ipython/</span> directory, create a new file called <span style="font-family: monospace">virtualenv.py</span> and open it with your editor. Next, add these contents to this file:</p>
<pre class="brush: python">
import site
from os import environ
from os.path import join
import sys

if 'VIRTUAL_ENV' in environ:
    virtual_env = join(environ.get('VIRTUAL_ENV'),
                       'lib',
                       'python%d.%d' % sys.version_info[:2],
                       'site-packages')

    # Remember original sys.path.
    prev_sys_path = list(sys.path)
    site.addsitedir(virtual_env)

    # Reorder sys.path so new directories at the front.
    new_sys_path = []
    for item in list(sys.path):
        if item not in prev_sys_path:
            new_sys_path.append(item)
            sys.path.remove(item)
    sys.path[1:1] = new_sys_path

    print 'VIRTUAL_ENV ->', virtual_env
    del virtual_env

del site, environ, join, sys
</pre>
<p>If you took a look at <a href="http://blog.ufsoft.org/2009/1/29/ipython-and-virtualenv">Pedro's version of <span style="font-family: monospace">virtualenv.py</span></a>, you'll recognize most of his code here. The important difference lies in the trickery we play with <span style="font-family: monospace">sys.path</span> in lines 12 through 22. These lines were inspired by a <a href="http://code.google.com/p/modwsgi/wiki/VirtualEnvironments">solution</a> to a problem presented by using <a href="http://docs.python.org/library/site.html#site.addsitedir"><span style="font-family: monospace">site.addsitedir()</span></a>, which adds new paths only to the end of <span style="font-family: monospace">sys.path</span>.</p>
<p>Adding paths to the end of <span style="font-family: monospace">sys.path</span> has, for our purposes, the undesirable side-effect of allowing system-wide packages and modules to preempt locally installed ones, since Python searches through <span style="font-family: monospace">sys.path</span> for modules and packages in first-to-last order. I have filed a <a href="http://bugs.python.org/issue7744">feature request for <span style="font-family: monospace">site.addsitedir()</span> to allow inserting new paths at the beginning of <span style="font-family: monospace">sys.path</span></a>; in the meantime, we'll use this hack inspired by the modwsgi programmers, which keeps track of the paths before and after the call to <span style="font-family: monospace">site.addsitedir()</span>, then swaps the position of the new paths from the end, to just after the first element, <span style="font-family: monospace">''</span>, which represents the current working directory (which should preempt every other path).</p>
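<p>To make the reordering concrete, here is a minimal sketch of the same idea as a pure function on plain lists. It is written for Python 3 rather than the Python 2 this post otherwise uses, and the function name <span style="font-family: monospace">promote_new_paths</span> and the example paths are made up for illustration; none of this is part of IPython, virtualenv, or modwsgi.</p>

```python
def promote_new_paths(before, after):
    """Return a copy of `after` with entries absent from `before` moved to index 1.

    `before` stands for sys.path prior to site.addsitedir(); `after` stands
    for sys.path afterwards, with the new directories appended at the end.
    """
    known = set(before)
    new_paths = [p for p in after if p not in known]
    old_paths = [p for p in after if p in known]
    # Keep the first entry ('' = current working directory) ahead of everything.
    return old_paths[:1] + new_paths + old_paths[1:]


# Hypothetical paths, for illustration only.
before = ['', '/usr/lib/python/site-packages']
after = before + ['/home/me/.virtualenvs/demo/lib/python/site-packages']
reordered = promote_new_paths(before, after)
print(reordered)
```

<p>In this toy example the sandbox directory ends up second, right behind <span style="font-family: monospace">''</span> and ahead of the system-wide directory, which is the search order we want.</p>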
<p>Now IPython will have access to the contents of the virtualenv sandbox in which you're currently working. For example, if I activate my <span style="font-family: monospace">networkx</span> virtual environment, which has the latest development version of the <a href="http://networkx.lanl.gov/">NetworkX graph library</a>, then fire up IPython, I get the following result (note the line that begins with <span style="font-family: monospace">VIRTUAL_ENV</span> indicating I'm accessing the virtualenv sandbox):</p>
<pre>
(networkx)$ ipython
VIRTUAL_ENV -> /home/lasher/.virtualenvs/networkx/lib/python2.6/site-packages
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
Type "copyright", "credits" or "license" for more information.
IPython 0.9.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object'. ?object also works, ?? prints more.
In [1]: import networkx
In [2]: networkx.__version__
Out[2]: '1.1.dev1518'
</pre>
<p>When I leave the sandbox (e.g., by using virtualenvwrapper's <span style="font-family: monospace">deactivate</span> command), I return to accessing the system-wide default install of NetworkX:</p>
<pre>
$ ipython
/var/lib/python-support/python2.6/IPython/Magic.py:38: DeprecationWarning: the sets module is deprecated
from sets import Set
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
Type "copyright", "credits" or "license" for more information.
IPython 0.9.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object'. ?object also works, ?? prints more.
In [1]: import networkx
In [2]: networkx.__version__
Out[2]: '0.36'
</pre>
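<p>Comparing version strings works when the two installs differ, but you can also ask Python directly where a module would come from. The following is a rough sketch, assuming Python 3 (not the Python 2.6 shown in the sessions above); the helper names <span style="font-family: monospace">module_origin</span> and <span style="font-family: monospace">in_active_virtualenv</span> are my own inventions for illustration, not part of IPython or virtualenv.</p>

```python
import importlib.util
import os


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def in_active_virtualenv(path):
    """True if `path` lives under the directory named by $VIRTUAL_ENV."""
    venv = os.environ.get('VIRTUAL_ENV')
    if not venv or not path:
        return False
    venv = os.path.realpath(venv)
    return os.path.commonpath([venv, os.path.realpath(path)]) == venv


origin = module_origin('json')
print(origin, in_active_virtualenv(origin))
```

<p>Swap <span style="font-family: monospace">'json'</span> for the package you care about; if the second value printed is false while a sandbox is active, the import is being satisfied from somewhere outside it.</p>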
<p>So there you have it: one IPython to rule all your virtualenv sandboxes!</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com6tag:blogger.com,1999:blog-6363246379112249261.post-77310762921876012072009-11-13T22:54:00.002-05:002011-06-25T20:14:08.326-04:00Time out: deterring brute force SSH attacks with iptables<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/macca/2051553911/" title="Brute Force by macca, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 204px; height: 250px;" src="http://farm3.static.flickr.com/2418/2051553911_08517c3876.jpg" alt="Brute Force" height="500" width="408" /></a>
<p>These are some simple <span style="font-family: courier new, monospace;">iptables</span> rules I keep around on my firewall to deter brute force SSH attacks. The original idea came from <a href="http://www.linkedin.com/in/dominikborkowski">Dominik Borkowski</a>, a sysadmin at <a href="http://www.vbi.vt.edu/">VBI</a>.</p>
<p>If the attacker attempts more than 4 connections within a minute, these rules temporarily blacklist them for the next minute—or as I like to say, "put them in time-out". Their packets will be dropped; to them, it will seem that the machine simply disappeared from the intarwebs. The rules will also log such violators to your syslog. I've found them very effective. Most scripts that these crackers run will drop off after one iteration and look for lower hanging fruit.</p>
<p>Of course, if you forget your password, or have a habit of making a couple of simultaneous connections to your computer, the door will shut on you, too, but the good news is that you'll only be blocked for a minute. More draconian methods that append to actual blacklists have a habit of locking their owners out. (Not that I'm speaking from personal experience <span style="font-style: italic;">at all</span>.) The rules escape this pitfall but will prove just as effective.</p>
<pre class="brush: bash">
## The rules below have been a very successful deterrent against SSH brute
## force attacks; they allow a maximum of 4 connection attempts within a minute.
iptables -A INPUT -p tcp -m state --state NEW --dport 22 -m recent --name sshattack --set
iptables -A INPUT -m recent --name sshattack --rcheck --seconds 60 --hitcount 4 -m limit --limit 4/minute -j LOG --log-prefix 'SSH attack: '
iptables -A INPUT -m recent --name sshattack --rcheck --seconds 60 --hitcount 4 -j DROP
iptables -A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
</pre>
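<p>If the bookkeeping behind those rules is hard to picture, here is a rough toy model of it in Python. It is only a sketch, not iptables itself: the class name <span style="font-family: courier new, monospace;">RecentList</span> and its method are made up for illustration, and it assumes the hit count includes the attempt currently being checked, so the attempt that reaches the count is the one that gets dropped.</p>

```python
from collections import defaultdict


class RecentList(object):
    """Toy stand-in for the `recent` match used in the rules above.

    Hypothetical helper: it only mimics the per-source bookkeeping and
    assumes the hit count includes the attempt being checked.
    """

    def __init__(self, seconds=60, hitcount=4):
        self.seconds = seconds
        self.hitcount = hitcount
        self.stamps = defaultdict(list)

    def new_connection(self, source, now):
        """Return 'ACCEPT' or 'DROP' for a new connection attempt."""
        # First rule (--set): record the attempt before anything else.
        self.stamps[source].append(now)
        # Later rules (--rcheck): keep only attempts inside the window.
        recent = [t for t in self.stamps[source] if self.seconds > now - t]
        self.stamps[source] = recent
        if len(recent) >= self.hitcount:
            return 'DROP'
        return 'ACCEPT'


firewall = RecentList()
for second in (0, 10, 20, 30, 100):
    print("%3d %s" % (second, firewall.new_connection('203.0.113.7', second)))
```

<p>Because every attempt is recorded before it is checked, a script that keeps hammering keeps itself in time-out, while a source that goes quiet for a minute starts over with a clean slate.</p>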
<p>I keep this in a firewall (shell) script that controls iptables rules and executes on bootup. If there's sufficient demand, I can make the entire script available.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com1tag:blogger.com,1999:blog-6363246379112249261.post-70385399968015954202009-07-01T16:10:00.005-04:002011-06-25T20:15:10.745-04:00"Who arrre you?" Getting the hostname back in the Jaunty GDM greeter<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/stelling/2361294634/" title="Remendos - Patchwork by ®oberto's, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 167px; height: 250px;" src="http://farm4.static.flickr.com/3267/2361294634_9855d05b1a.jpg" width="334" height="500" alt="Remendos - Patchwork" /></a>
<p>The latest <a href="http://www.ubuntu.com/">Ubuntu</a> release, 9.04, codename "<a href="http://www.ubuntu.com/products/whatisubuntu/904features/">Jaunty Jackalope</a>", has turned out to be one of the best, maybe even on par with the "Gutsy Gibbon" release. The aesthetics definitely got some love; for example, if you're not running the "Dust" theme, you're missing out. [Hint: go to <span style="font-style:italic;">Preferences</span> -> <span style="font-style:italic;">Appearance</span> -> <span style="font-style:italic;">Theme</span> and select "Dust"] The GDM greeter login screen looks the best of any Ubuntu release.</p>
<p>Unfortunately, a little bit of usability got lost along the way; most notably, the hostname no longer appears anywhere on the graphical login. This probably bothers only a minority of people, but our lab, for example, just updated all its machines to Jaunty, and we couldn't tell from the greeters which machine belonged to which hostname without logging in. I sat down this afternoon for a few minutes to figure out how the GDM themes work. It turns out they're just coded as fairly simple XML, and looking at other themes, I eventually figured out what to tweak. <a href="http://gotgenes.com/files/hostname_patch_for_Human.xml.patch">This patch</a> will bring back the beloved hostname to the GDM login.</p>
<p>To use this patch, just do</p>
<pre>sudo patch -p0 < /path/to/hostname_patch_for_Human.xml.patch</pre>
<p>Now you'll no longer have to look at login screens and wonder, <a href="http://www.youtube.com/watch?v=v39qfgJQOYw">"Who arrre you?"</a></p>
<p><span style="font-weight:bold;">Update (16:17):</span> Apparently Blogger's software won't allow XML in their pre tags, so I just hosted the patch on my server instead. All the more reason why I need to host my own blog with WordPress or something soon...</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com0tag:blogger.com,1999:blog-6363246379112249261.post-80307795263284207412009-06-28T19:57:00.007-04:002011-06-25T20:16:20.003-04:00Is it in one's Nature?<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/h-k-d/2837128711/" title="Bonsai Moon by h.koppdelaney, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; height: 175px; width: 250px" src="http://farm4.static.flickr.com/3275/2837128711_59740ee027.jpg" width="500" height="351" alt="Bonsai Moon" /></a>
<p>Today is Sunday, a day of rest to some, but to heathens with Monday meetings like myself, a day of catching up and doing all the things we thought we'd get done earlier. Unfortunately for me, our LDAP server that gives us access to the network is down... again... for the third weekend in a row, preventing access to our workstations, data, and worst for me, my research notebook, which I keep on our group's wiki. I admittedly felt a strong temptation to get out, enjoy the sunshine, and play a little guitar, but here I sit, in the cold, gray, fluorescent-lit cube. I'm here because I'm trying to be less incompetent as a scientific researcher.</p>
<p>One of the things that particularly makes me feel incompetent is my lack of knowledge of scientific literature, and (to a greater extent?) my lack of enthusiasm for reading it. I don't know why, and I give myself grief for this, but I often find reading scientific papers just plain boring. The funny thing is, I really appreciate <span style="font-style: italic;">science</span>, by which I mean the technique of elucidating one's knowledge of the world through rigorous, reproducible means, and keeping a skeptical mindset, <span style="font-style: italic;">especially</span> when it comes to one's own work. Likewise, I will never cease to find biology or computational technology among the most satisfactory pursuits for the very limited time and energy I have here on this good Earth. Yes, science, itself, is awesome, but the excitement of it gets stripped away in a lot of formal education environments, and for me, in the way scientists present it in their formal literature.</p>
<p>I have to qualify that last statement as pertaining to myself because I have colleagues who clearly find the literature still stimulating; a good example is <a href="http://arjun.krish.googlepages.com/">Arjun Krishnan</a>. At any given point, Arjun can tell you a few relevant papers he's read on seemingly any subject, he can give you solid summaries, and he turns it into good research questions, some of which he's following up on. He's a paragon of the Good Graduate Student; I have no doubts Arjun is going to be a superstar scientist in whatever field he ends up in, if not in general. I am certainly no Arjun, however, so I have to focus on humbler goals.</p>
<p>One of our tasks as students in the <a href="http://people.cs.vt.edu/~murali/">Murali group</a> is to canvass over a dozen of the journals in bioinformatics and computational biology and scout for pertinent articles. I decided to use my "downtime" to have at the growing stack of journal headlines in my RSS feeds, and since I needed a place to start, I thought I'd tackle my <a href="http://www.nature.com/">Nature</a> stack, which I'd neglected since the end of May. This meant a backlog of over two hundred articles. I scanned through each headline, pausing at ones that had life sciences subjects, opening up a few that had keywords that caught my attention, taking a genuine look at a few of those, and skipping over the rest. At the end of the process, I felt really disappointed.</p>
<p>Of the several hundred articles, I only wound up reading <a href="http://www.nature.com/nature/journal/v459/n7247/full/459619a.html">three</a> <a href="http://www.nature.com/nature/journal/v459/n7247/full/459619c.html">research</a> <a href="http://www.nature.com/nature/journal/v459/n7249/full/459893d.html">highlights</a>, the abstract of <a href="http://www.nature.com/nature/journal/v459/n7250/full/nature08062.html">one letter</a>, the abstract and some of the figures in <a href="http://www.nature.com/nature/journal/v459/n7250/full/nature08182.html">another</a>, and the abstract and some of the methods in <a href="http://www.nature.com/nature/journal/v459/n7249/full/nature08021.html">another</a>, and <span style="font-style:italic;">none</span> of these proved at all pertinent to research I am supposed to be doing now.</p>
<p>Even more worth pointing out: at no time did I read the title of a full-fledged research article and think, "Wow, I should read that," or even, "Gee, that sounds interesting." The vast majority of the titles just struck me as extremely esoteric, and this confuses me the most. Aren't Nature, Science, and PNAS supposed to have articles that are of interest not just to a specific field, but to the entire scientific community? But you know, I'm not interested that <a href="http://www.nature.com/nature/journal/v459/n7250/abs/nature08109.html">"GOLPH3 modulates mTOR signalling and rapamycin sensitivity in cancer"</a>, or that <a href="http://www.nature.com/nature/journal/v459/n7248/abs/nature08085.html">"Histone H4 lysine 16 acetylation regulates cellular lifespan"</a>, or in <a href="http://www.nature.com/nature/journal/v459/n7249/abs/nature08104.html">"A newly discovered protein export machine in malaria parasites"</a>. I fail to feel these discoveries shaking my perception of the world around me, giving me a new topic to explore, or helping me make my own discoveries.</p>
<p>Nature is a journal that can make tenure, a journal where scientists experience great thrills for getting in and great envy when their colleagues do, a journal that says, "I publish <a href="http://www.hulu.com/watch/66312/saturday-night-live-digital-short-like-a-boss">like a boss</a>!" It's a journal where <span style="font-style:italic;">my</span> boss says, "You should be reading it anyway." So obviously, like so many things in scientific research, I just don't get it. And now, after my attempt to gain a little face today, I'm right back to where I started. Go ahead, just say it—I'm the worst scientist in the world. I'm a <a href="http://www.youtube.com/watch?v=Fig956-MuVA">cotton-headed ninny muggins</a>.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com3tag:blogger.com,1999:blog-6363246379112249261.post-31381585514884104952009-04-30T05:09:00.011-04:002011-06-25T20:18:16.473-04:00How symbolic: on removing symlinks in Bazaar VCS<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/darwinbell/465459020/" title="the weakest link by Darwin Bell, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 190px; height: 250px;" src="http://farm1.static.flickr.com/207/465459020_d8a492a31f.jpg" alt="the weakest link" width="381" height="500" /></a>
<p>I have a <a href="http://igotgenes.blogspot.com/2009/03/why-biopython-needs-to-move-to-github.html">strong affinity</a> for <a href="http://betterexplained.com/articles/a-visual-guide-to-version-control/">distributed revision control systems</a>, and my favorite has been <a href="http://bazaar-vcs.org/">Bazaar VCS</a> (a.k.a., bzr). Like any piece of software, bzr has its quirks and shortcomings. Tonight, I encountered its rather tricky behavior when it comes to <a href="http://en.wikipedia.org/wiki/Symbolic_link">symbolic links</a> (symlinks).</p>
<p>I keep <a href="https://code.launchpad.net/%7Echris.lasher/+junk/shell-configs">my configurations</a> under revision control, which gives me the benefits of rolling back changes when I inevitably break things, and of setting up home on a new system, even a remote one, very quickly and easily. All was well, but I discovered that when I naively placed my <span style="font-family:courier new;">.vim/</span> directory under the repository, I added a ton of symlinks to files in <span style="font-family:courier new;">/usr/share/vim/addons</span>. These symlinks were present because I used Ubuntu's <a href="http://packages.ubuntu.com/jaunty/vim-scripts">vim-scripts</a> and <a href="http://packages.ubuntu.com/jaunty/vim-addon-manager">vim-addon-manager</a> packages to install these addons to my Vim profile, which essentially just sets up symlinks to the addons, stored in <span style="font-family:courier new;">/usr/share</span>. It's a pretty reasonable system, actually, but it doesn't make sense to have these symbolic links stored in my branch. I can't guarantee that each system I work on will have the files the symlinks point to; therefore, I thought it best to remove them. Therein I encountered a sticky issue with bzr: you really can't remove symlinks from its revision tracking easily.</p>
<p>I thought I could be clever and write a simple one-liner in <a href="http://en.wikipedia.org/wiki/Bash">Bash</a> to remove all the symlinks presently tracked by bzr from further tracking, but still leave them on the file system (I still need the symlinks there, after all, or my Vim goodness will break).</p>
<pre class="brush: bash">
for file in $(bzr ls -V); do        # use bzr ls -VR in later versions
    if [ -h "$file" ]; then         # see if the file is a symlink
        echo "Removing $file"
        bzr rm --keep "$file"       # remove from tracking, not the FS
    fi
done
</pre>
<p>Okay, so I reformatted it for annotation, but trust me, it fits on one line. Anyway, I immediately encountered problems, getting this as output:</p>
<pre>
.vim/compiler/tex.vim
bzr: ERROR: Not a branch: "/usr/share/vim/addons/compiler/tex.vim/".
.vim/doc/NERD_commenter.txt
bzr: ERROR: Not a branch: "/usr/share/vim-scripts/doc/NERD_commenter.txt/".
.vim/doc/bufexplorer.txt
bzr: ERROR: Not a branch: "/usr/share/vim-scripts/doc/bufexplorer.txt/".
.vim/doc/imaps.txt.gz
bzr: ERROR: Not a branch: "/usr/share/vim/addons/doc/imaps.txt.gz/".
.vim/doc/latex-suite-quickstart.txt.gz
...
</pre>
<p>WTF? "Not a branch!?"</p>
<p>Okay, so, what happens here is that Bazaar de-references the symlink before attempting to remove it, which is not at all what I had in mind. Poking around <a href="https://launchpad.net/">Launchpad</a>, you can find <a href="https://bugs.launchpad.net/bzr/+bug/257665">several</a> <a href="https://bugs.launchpad.net/bzr/+bug/186194">bug</a> <a href="https://bugs.launchpad.net/bzr/+bug/128562">reports</a> <a href="https://bugs.launchpad.net/bzr/+bug/236149">regarding</a> the way Bazaar deals with symlinks. The workaround solutions proposed in those—remove the symlink using <span style="font-family:courier new;">rm</span>—wouldn't work for me, because I needed to retain the actual symlinks on the filesystem.</p>
<p>At this point I had solicited the attention of <a href="https://launchpad.net/%7Elifeless">Robert Collins</a>, a.k.a. lifeless in #bzr on <a href="http://freenode.net/irc_servers.shtml">Freenode</a>. When I told him the workaround wouldn't work for me, and that I'd need to write a script, he suggested I use <span style="font-family:courier new;">WorkingTree.unversion()</span> from <span style="font-family:courier new;">bzrlib</span>. Despite being a Python fanatic [understatement] and bzr's codebase being in Python, when I said "script", I meant "Bash script". It never occurred to me to actually write a Python script until he mentioned that. By the completion of the thought, though, I was digging into the codebase of <span style="font-family:courier new;">bzrlib</span> to figure out what to do.</p>
<p>My initial plan included using <a style="font-family: courier new;" href="http://docs.python.org/library/os.html#os.walk">os.walk()</a> to move through the filesystem, <a style="font-family: courier new;" href="http://docs.python.org/library/os.path.html#os.path.islink">os.path.islink()</a> to identify the symbolic links, and then <span style="font-family:courier new;">WorkingTree.unversion()</span> to mark the files for removal from tracking. I ran into a problem, however, in that <span style="font-family:courier new;">unversion()</span> only accepts a list of file IDs, as specified by the bzr metadata. Robert pointed me towards a method called <span style="font-family:courier new;">path2ids()</span>, but I had trouble figuring out how I was going to give it the proper paths. os.walk will let me construct absolute paths to files, but I really needed relative paths to the files, truncated at a certain point past the root (e.g., <span style="font-family:courier new;">.vim/compiler/tex.vim</span> instead of <span style="font-family:courier new;">/home/chris/shell-configs/.vim/compiler/tex.vim</span>). I could see it was getting a little hairy, so I decided to dig a little further into <span style="font-family:courier new;">WorkingTree</span> code and see if there was anything else I could use.</p>
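<p>For the curious, the filesystem half of that first plan might have looked something like the sketch below. It is not what I ended up using, the function name <span style="font-family:courier new;">find_symlinks</span> is made up for illustration, and it assumes Python 2.6 or later for <span style="font-family:courier new;">os.path.relpath()</span>. It still leaves the harder problem of turning paths into bzr file IDs untouched.</p>

```python
import os


def find_symlinks(root):
    """Return the paths of all symlinks under root, relative to root.

    Hypothetical helper for illustration; it knows nothing about bzr.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories show up in dirnames and symlinks to
        # files in filenames, so check both lists.
        for name in dirnames + filenames:
            full_path = os.path.join(dirpath, name)
            if os.path.islink(full_path):
                # Trim the leading part so the path starts at the root
                # of the branch, e.g. .vim/compiler/tex.vim
                found.append(os.path.relpath(full_path, root))
    return sorted(found)


if __name__ == '__main__':
    for path in find_symlinks(os.getcwd()):
        print(path)
```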
<p>What I discovered was the jackpot in the form of <span style="font-family:courier new;">WorkingTree.walkdirs()</span>, a method written precisely for what I needed: traversing the filesystem, identifying the filetypes (especially symlinks), and providing file IDs. Within a few minutes, I banged out a script that did exactly what I needed it to do, presented below.</p>
<pre class="brush: python">#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# Copyright (c) 2009 Chris Lasher
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
"""
A simple script to go through a Bazaar repository and ruthlessly
remove all symbolic links (symlinks) from further tracking.
It's important to note that this will not actually remove the symlinks
from the physical filesystem. This is left to the user, if so desired.
"""
__author__ = 'Chris Lasher'
__email__ = 'chris DOT lasher AT gmail DOT com'

import os

import bzrlib.workingtree

tree = bzrlib.workingtree.WorkingTree.open(os.getcwd())
# use protection -- one-on-one action only
# (take the lock before the try, so we never unlock a tree we failed to lock)
tree.lock_write()
try:
    symlink_ids = []
    for dir_info, file_list in tree.walkdirs():
        # dir_info[1] (the file_id) will be None if the directory is not
        # under revision control, so this will skip it if it's not
        if dir_info[1]:
            for file_data in file_list:
                # file_data[2] is the file type, and file_data[4] is the
                # file_id, the necessary specifier for removing the file
                # from revision tracking
                if file_data[2] == 'symlink' and file_data[4]:
                    print("Removing %s" % file_data[0])
                    symlink_ids.append(file_data[4])
    tree.unversion(symlink_ids)
finally:
    # okay, all yours
    tree.unlock()
</pre>
<p>Hopefully someone else will find this little script useful. It's under the Apache version 2 license; make whatever use of it you can for your particular predicament.</p>
<p>So what were the lessons learned here?</p>
<ol><li>Exercise a little restraint and consideration about what you put under revision control in the first place.</li><li>It's awesome to be able to have direct contact with developers of your tools.</li><li>It's even more awesome to be able to dig right into their code and help yourself.</li><li>Just like in <a href="http://people.cs.vt.edu/%7Emurali/">Murali's</a> brutal <a href="http://courses.cs.vt.edu/%7Ecs5114/spring2008/">Theory of Algorithms course</a>, in real life, when facing difficulty solving a problem one way, don't be afraid to step back and try an approach from another (the opposite) direction. Trust your gut: if it feels like the hard way of doing something, it probably is; find the lazy (smart) way.</li></ol>
<p>A special thanks to Robert for his guidance and help.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com1tag:blogger.com,1999:blog-6363246379112249261.post-34528403404150808282009-03-21T00:32:00.005-04:002011-06-25T20:19:40.531-04:00Do you choose the research, or does the research choose you?<p><span style="font-style: italic;">Note: I give advance warning to my non-biologist readers that the next few paragraphs below contain a good dose of biology. While I have attempted to keep it conversational, if you feel your eyes glazing over, skip down a few paragraphs for the real meat.</span></p>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/joka2000/137294320/" title="to the air by joka2000, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 250px; height: 166px;" src="http://farm1.static.flickr.com/55/137294320_81b08da74a.jpg" alt="to the air" width="500" height="332" /></a>
<p>Yesterday (Friday) I attended a talk by <a href="http://ccr.cancer.gov/staff/staff.asp?profileid=5786">Susan Gottesman</a> about <a href="http://en.wikipedia.org/wiki/Non-coding_RNA">small non-coding RNAs (sRNAs)</a> and how they are involved in protein degradation. At this point in her esteemed career, Gottesman's best known work revolves around a particular <span style="font-style: italic;">Escherichia coli</span> <a href="http://en.wikipedia.org/wiki/Sigma_factor">sigma factor</a> (a protein responsible for initiating <a href="http://en.wikipedia.org/wiki/Transcription_%28genetics%29">transcription</a>) called RpoS. RpoS facilitates transcription of genes into <a href="http://en.wikipedia.org/wiki/MRNA">messenger RNAs (mRNAs)</a> at low temperatures. Now, RpoS <span style="font-style: italic;">only</span> appears in <span style="font-style: italic;">E. coli</span> cells during low temperature conditions, but mysteriously (or so it was), the gene that encodes RpoS gets transcribed even when the cells are growing at a comfortable temperature.</p>
<p>As Gottesman's lab discovered, the mRNA for RpoS can actually bend back around and stick to itself such that ribosomes aren't able to bind to the mRNA and translate it into the RpoS protein. An sRNA called DsrA, however, which is expressed in low temperature conditions, binds to part of the RpoS mRNA, preventing the mRNA from folding back on itself, and giving ribosomes access to the transcript to translate it into RpoS protein. Why is this important?</p>
<p>Well, previously, sRNAs had only been thought to inhibit translation and prevent proteins from appearing. That is, we say that sRNAs usually <span style="font-style: italic;">inhibit</span> the expression of a protein, so if you found an sRNA, you would bet that its target wouldn't appear when it appeared. Add sRNA and the protein won't be found in the cell; take the sRNA away, and the protein re-appears. The Gottesman lab, however, demonstrated a case where the sRNA actually is responsible for making the proteins appear. That is, when the sRNA DsrA appears, its target, RpoS, appears too; and if you take away DsrA, the protein goes away, too! Craziness! In Biology, we call this a paradigm shift. Paradigm shifts are "big deals", because Biology is all about figuring out the rules, and then identifying the exceptions so we have to re-write the rules. Biology is the science of exceptions.</p>
<p>The story continues, but I'll leave it to the reader to check out <a href="http://www.ncbi.nlm.nih.gov/sites/entrez?db=pubmed&cmd=search&term=gottesman+susan">Gottesman's publications</a> for more, because as much as I liked the story of her research, what I found most interesting about the talk was this side comment that she made towards the end, which I paraphrase here:
<blockquote>We published this work with RpoS, but then we wanted to work in other directions. We'd try something, then discover we couldn't go in that direction because we needed to know something else about RpoS. Then we'd attempt something else, but again, it would always come back to RpoS. Finally we just said, "Forget it! Fine! We'll just study RpoS. Clearly there's enough here to work on for a while."
</blockquote>I don't know if the fellow grad students in the audience caught the subtle significance of this statement, or if perhaps I was the only person who found this significant. What Gottesman said, in more words, is that she didn't really choose her research; her research chose <span style="font-style: italic;">her</span>. Yet, in spite of spending her career in an area she never intended to stay in, once she identified that she was mired in it, she made the best of it, leading to great scientific contributions and earning her accolades and prestige that even the most jaded of us junior researchers catch ourselves fantasizing about from time to time.</p>
<p>I find this significant because, also from time to time, I wonder how the researchers, and even my peers, that I have come to admire wound up doing the research that they're doing. In my earlier days, I often thought they must possess great foresight and wisdom. While I don't doubt they're clever people, the longer my tenure in research and the more people I harass to tell me about their own careers, the more I've begun to think that a lot of it just comes by chance rather than deliberate choice. We find ourselves in a particular, unique position, somewhat stuck, and somewhat stumped, and we throw up our hands and say, "Aw, Hell! I guess I might as well dig around while I'm here." We do have to make choices about where we dig, but we seem to get to choose our own particular holes about as well as seeds scattered by the winds. (Though, from time to time, we can <a href="http://igotgenes.blogspot.com/2008/12/today-i-quit.html">try to ride the winds to another hole</a>.)</p>
<p>I suppose that I just find it amusing that life is stochastic from the molecular level all the way up to our own grand plans. Like each of our cells, we may as well just deal with the cards we're dealt as best we can. For everything else... well, <a href="http://www.last.fm/music/Vince+Guaraldi+Trio/_/Cast+Your+Fate+to+the+Wind">"Cast Your Fate to the Wind"</a>.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com1tag:blogger.com,1999:blog-6363246379112249261.post-83979472664185373992009-03-16T03:34:00.002-04:002011-06-25T20:21:36.748-04:00If a tree falls in a random forest: a summary of Chen and Jeong, 2009<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/angelrays/343449552/" title="Trees in fog w shadows by Angelrays, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 180px; height: 250px;" src="http://farm1.static.flickr.com/125/343449552_d42fafed62.jpg" width="359" height="500" alt="Trees in fog w shadows" /></a>
<p>I had to write a summary for a paper, <a href="http://bioinformatics.oxfordjournals.org/cgi/content/full/25/5/585">"Sequence-based prediction of protein interaction sites with an integrative method" by Xue-wen Chen and Jong Cheol Jeong</a>[<a href="#chenjeong">1</a>], for my Problem Solving in Bioinformatics course. I thought I'd share the review here on my blog, in case anybody finds it remotely useful. I doubt anyone will, but it's my blog, so there. Be forewarned, this is my, "Hey, buddy, I'm just a biologist" interpretation of their paper. If you spot any specious, misleading, or just plain incorrect statements, please, by all means, offer corrections.</p>
<hr />
<p>Chen and Jeong have essentially found a method to apply a machine learning technique called random forests to predict specific binding sites on proteins given only the amino acid sequence with greater accuracy than previously existing methods. Identification of binding sites in proteins remains an important task for both basic and applied life sciences research, for these sites make possible the protein-protein and protein-ligand interactions from which phenotypes, and indeed, the properties of life emerge. These sites also serve as important drug targets for pharmaceutical research.</p>
<p>Traditionally, researchers have identified binding sites from in vivo or in vitro studies involving point mutations that affect phenotypes, as well as through analysis of protein structures as identified through protein crystallography. With the advent and continuous improvement of DNA sequencing technology, however, researchers contribute ever more knowledge in the form of amino acid sequence, rather than structures. Sequencing has rapidly outpaced crystallography, necessitating prediction of proteins' functional characteristics based solely on their amino acid sequence, which Chen and Jeong cite as the motivation behind research presented in this paper.</p>
<p>Previous efforts to infer binding sites purely from amino acid sequence used a different machine learning method called Support Vector Machine (SVM). I'm not entirely certain how SVMs operate, but like random forests, they require a training set of known binding sites and sites not involved in binding. One of the confounding factors about amino acid sequences when applied to machine learning methods like SVMs is that the residues are unevenly distributed between the two categories; in other words, few amino acids in a sequence (1 in 9 in the dataset used by Chen and Jeong) will sit at the interface of the protein and its ligand. Chen and Jeong chose to use random forests because they are robust against this bias in the data. This has to do with the way that random forests are constructed.</p>
<p>For constructing random forests, one must have a set of data. In Chen and Jeong's study, the set is composed of amino acids belonging to 99 polypeptide chains (or chunks of proteins) culled from a protein-protein interaction set used in previous studies. One must also have a set of features, or measures, about each item in the data set. In this study, there were 1050 features (as stored in vectors) for each amino acid, which fall into one of three categories: those measuring physical or chemical characteristics (e.g., hydrophobicity, isoelectric point, and propensity, which is a fancy word for saying whether an amino acid is likely to be on the surface of a protein or buried deep within it), those measuring the amino acid's minimum distance to any other given amino acid along the sequence, and the position specific score matrix (PSSM), which has to do with how likely certain amino acid substitutions are at that point.</p>
<p>With this data set and features in hand, one feeds it to the random forest generator. To construct one random decision tree, follow a process like this:</p>
<ol><li>Count the total number of known interface sites (we'll call these positives), and call this number <span style="font-style: italic;">N</span>.</li><li>Count the number of features available, and call this number <span style="font-style: italic;">M</span>.</li><li>Randomly select a subset of <span style="font-style: italic;">N</span> sites out of the entire set with—and this is important—replacement. This solves the problem of the unbalanced data set. If I recall my statistics correctly (I don't) this has to do with each site now having equal chance at influencing the training.</li><li>Now we build the tree. We randomly select <span style="font-style: italic;">m</span> features from the total <span style="font-style: italic;">M</span> features, where <span style="font-style: italic;">m</span> is a lot smaller than <span style="font-style: italic;">M</span>. Then, of those <span style="font-style: italic;">m</span> features, we choose the one which best splits the subset of sites. We continue to do this recursively until all sites have been "classified".</li><li>We repeat steps 1-4 to construct the number of desired trees (100 in this study), which gives us our "forest" of randomly generated trees.</li></ol>
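<p>To make the steps above a little more concrete, here is a rough Python sketch of growing such a forest. It is my own toy illustration, not Chen and Jeong's code: every function name is made up, splits are scored with Gini impurity (an assumption on my part), and the bootstrap simply draws as many sites as the training set contains, so swap in <span style="font-style: italic;">N</span> from step 1 if you want to follow the steps to the letter.</p>

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means pure)."""
    total = float(len(labels))
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows, labels, features):
    """Return the (feature, threshold) pair with the lowest weighted
    impurity, or None if none of the features separates the rows."""
    best, best_score = None, None
    for f in features:
        for threshold in set(row[f] for row in rows):
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            left = [lab for row, lab in zip(rows, labels) if not row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best_score is None or best_score > score:
                best, best_score = (f, threshold), score
    return best


def build_tree(rows, labels, m, rng):
    """Grow one tree: at each node, try m randomly chosen features."""
    if len(set(labels)) == 1:
        return labels[0]  # leaf: every site here has the same class
    features = rng.sample(range(len(rows[0])), m)
    split = best_split(rows, labels, features)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, threshold = split
    go_right = [row[f] > threshold for row in rows]
    left = [(r, l) for r, l, g in zip(rows, labels, go_right) if not g]
    right = [(r, l) for r, l, g in zip(rows, labels, go_right) if g]
    return (f, threshold,
            build_tree([r for r, _ in left], [l for _, l in left], m, rng),
            build_tree([r for r, _ in right], [l for _, l in right], m, rng))


def build_forest(rows, labels, n_trees, m, seed=0):
    """Grow n_trees trees, each on its own sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in picks],
                                 [labels[i] for i in picks], m, rng))
    return forest


def predict(tree, row):
    """Let one site trickle down one tree and return the leaf's class."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = right if row[f] > threshold else left
    return tree
```

<p>Here <span style="font-family:courier new;">rows</span> is a list of feature vectors (one per amino acid site) and <span style="font-family:courier new;">labels</span> holds a 0 or 1 for each, so something like <span style="font-family:courier new;">build_forest(rows, labels, n_trees=100, m=30)</span> would grow a forest the size of the one in the study; the value of <span style="font-style: italic;">m</span> there is just a placeholder.</p>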
<p>With the random forest constructed, you feed an amino acid site into it; the site trickles down each tree, and each tree then "votes" as to whether it classified the site as an interaction site. A simple majority can be used to categorize the site, or more stringent criteria, such as "at least 5 votes are necessary to categorize the site as an interface site". Increasing the votes required improves the confidence with which one claims a site is an interaction site (specificity), but decreases the probability of detecting interaction sites (sensitivity).</p>
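<p>The voting step is easy to sketch. In this minimal example each "tree" is just a function that returns <tt>True</tt> or <tt>False</tt> for a site; the function names, the feature names used as dictionary keys, and the cutoffs are hypothetical and chosen only to show how the vote threshold changes the call.</p>

```python
def forest_votes(trees, site):
    """Count how many trees classify the site as an interface site."""
    return sum(1 for tree in trees if tree(site))


def classify(trees, site, min_votes=None):
    """Call the site an interface site if enough trees vote for it.

    With min_votes=None, a simple majority of the trees is required.
    """
    if min_votes is None:
        min_votes = len(trees) // 2 + 1
    return forest_votes(trees, site) >= min_votes


# Three toy "trees" standing in for trained decision trees.
trees = [
    lambda site: site["hydrophobicity"] > 0.5,
    lambda site: site["propensity"] > 0.5,
    lambda site: site["hydrophobicity"] + site["propensity"] > 1.5,
]

site = {"hydrophobicity": 0.9, "propensity": 0.4}
print(forest_votes(trees, site))           # 1
print(classify(trees, site))               # False: a majority needs 2 votes
print(classify(trees, site, min_votes=1))  # True: a looser threshold
```

<p>Raising <tt>min_votes</tt> trades sensitivity for specificity, exactly as described above: fewer sites are called, but each call has more trees behind it.</p>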
<p>Using these measures of sensitivity and specificity in conjunction with leave-one-out studies (one polypeptide sequence is used as the test case, and the other 98 are used as training data), Chen and Jeong demonstrated that their random forests approach performed significantly better than the SVM approach used by the earlier studies. They attribute this improved performance to two things: random forests are more robust to unbalanced data sets, and their approach considered many more features than the previous studies'. When they used only the features used in the previous studies, they found decreased performance, albeit still significantly better than the previous methods'. Chen and Jeong note that a major feature of random forests is that their accuracy increases, rather than decreases, when the number of features increases, due to the random sampling.</p>
<p>Chen and Jeong finished their study with a prediction of binding sites on the DnaK (or Hsp70 in eukaryotes) chaperone system. Their results were corroborated by several in vivo studies in which mutations near the predicted sites changed phenotypes in both prokaryotic and eukaryotic forms. Their visualization of predicted interaction sites using 3D molecular modeling software provided additional support.</p>
<ol><li><a name="chenjeong">Xue-wen Chen</a> and Jong Cheol Jeong, "Sequence-based prediction of protein interaction sites with an integrative method," <span style="font-style: italic;">Bioinformatics</span> 25, no. 5 (March 1, 2009): 585-591, doi:10.1093/bioinformatics/btp039. <span class="Z3988" title="url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&rft_id=info%3Adoi/10.1093/bioinformatics/btp039&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Sequence-based%20prediction%20of%20protein%20interaction%20sites%20with%20an%20integrative%20method&rft.jtitle=Bioinformatics&rft.volume=25&rft.issue=5&rft.aufirst=Xue-wen&rft.aulast=Chen&rft.au=Xue-wen%20Chen&rft.au=Jong%20Cheol%20Jeong&rft.date=2009-03-01&rft.pages=585-591">
</span></li></ol>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com0tag:blogger.com,1999:blog-6363246379112249261.post-11969530243693798882009-03-15T01:36:00.012-04:002011-06-25T20:24:22.775-04:00Why Biopython needs to move to GitHub or Launchpad<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/tawheedmanzoor/2458011059/" title="Air hosting? by Tawheed Manzoor, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 250px; height: 188px;" src="http://farm3.static.flickr.com/2321/2458011059_09da7c2a45.jpg" alt="Air hosting?" width="500" height="375" />
<p></a><a href="http://genedrift.org/">Paulo Nuin</a> wrote a <a href="http://python.genedrift.org/2009/03/13/biopython-and-cvs/">spot on post</a> about the ridiculousness that is <a href="http://biopython.org/">Biopython</a> still using <a href="http://www.nongnu.org/cvs/">CVS</a> as its <a href="http://en.wikipedia.org/wiki/Revision_control">revision control system</a> (a.k.a. source code management, or SCM), when we code in an era of arguably superior tools in the form of <a href="http://betterexplained.com/articles/intro-to-distributed-version-control-illustrated/">distributed SCMs (DSCMs)</a>. Please read his post if you haven't yet. <a href="http://en.wikipedia.org/wiki/Monopoly_%28game%29">Do not pass go. Do not collect $200.</a> This post will be here for you when you get back.</p>
<p>I'll continue along the thread that Paulo started, in which one of the hangups that the Biopython community must overcome is: "Supposing we do switch to a DSCM, where do we host the code?" Until the Biopython project can decide on an answer to this question, the project won't move to anything.</p>
<p><a href="http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/">Peter Cock</a> seems <a href="http://lists.open-bio.org/pipermail/biopython-dev/2009-March/005470.html">sincerely determined</a> that the code be hosted on the <a href="http://www.open-bio.org/">Open Bioinformatics Foundation</a> (OBF) servers at <a href="http://biopython.org/">Biopython.org</a>. If I understand Peter's rationale correctly, the notion stems from the desire to maintain control of the code hosting. The alternative to self-hosting the code is to use one of the big players. I'm particularly referring to <a href="https://github.com/">GitHub</a> and <a href="https://launchpad.net/">Launchpad</a>. GitHub and Launchpad host repositories for the DSCMs <a href="http://git-scm.com/">Git</a> and <a href="http://www.bazaar-vcs.org/">Bazaar</a>, respectively, and provide a set of tools around these repositories to facilitate collaboration and interactions between the developers and their communities. Launchpad has the backing of <a href="http://www.canonical.com/">Canonical</a>, best known for managing the <a href="http://www.ubuntu.com/">Ubuntu GNU/Linux distribution</a>, and GitHub has the backing of the only group more rabid than the Python community: the <a href="http://www.ruby-lang.org/">Ruby</a> community; hence, I refer to them as "the big players".</p>
<p>I respect Peter's legitimate concerns. I also really respect Peter, who is <a href="http://igotgenes.blogspot.com/2008/08/not-biopythonista-i-thought-id-be.html">much more of a Biopythonista than I'll ever be</a>, and I recognize it will take his blessing for the transition to Git or Bazaar to succeed. I dedicate this blog post to changing Peter's opinion and convincing him that hosting on GitHub or Launchpad is the best option available to us at the time.<a href="#lpstar">*</a> Hopefully I'll convince a few other Biopython (or Bio-anything) Devs along the way, too. :-)</p>
<p>The following are my top five reasons for hosting Biopython on GitHub/Launchpad:</p>
<ol>
<li><span style="font-weight: bold;">It's free.</span> Yeah, okay, only "as in beer"<a href="#lpstarstar">**</a>, but the Biopython source will, itself, remain open. The hosting is generously on someone else's dime, and that's all we need.</li>
<li><span style="font-weight: bold;">It already exists.</span> I do not have technical experience nor interest in running my own webserver-based interface to either Bazaar or Git. From the <a href="http://lists.open-bio.org/pipermail/biopython-dev/2009-February/005316.html">recent</a> <a href="http://lists.open-bio.org/pipermail/biopython-dev/2009-February/005214.html">discussions</a> on the Biopython mailing list, I will guess nobody on the Biopython Dev team does or has the time to learn how to, either. Since the OBF staff are volunteers, helping us set these up won't be high on their priority list. Bazaar and Git don't even exist on the servers, yet. Launchpad and GitHub already have the tools in place. The amount of time the Biopython community has to spend setting up the projects here is pretty minimal and painless. In fact, it's already <a href="http://github.com/biopython/biopython/tree">been</a> <a href="https://launchpad.net/biopython">done</a>. Launchpad and GitHub are clearly very good at what they do. They have the experts, the redundancy, and the robustness to manage hosting code in a public space, and all the headaches that come with it, so that we don't have to.</li>
<li><span style="font-weight: bold;">They have established social networks.</span> I'm already on <a href="https://github.com/gotgenes">GitHub</a> and <a href="https://launchpad.net/~chris.lasher">Launchpad</a>. A lot of us are already on these sites, working on our own and other open source projects. These places let other people discover our work, and allow serendipitous connections to occur. "Hmm, this gal works on Biopython. What's that?" This doesn't occur at Biopython.org—people only go there when they know what they're looking for (and not many people are looking for "<a href="http://lmgtfy.com/?q=bioinformatics+python">bioinformatics python</a>"). Additionally, potential employers, co-workers, and employees are on these sites; not all of us will be (un)fortunate or content enough to stay in bioinformatics and computational biology forever.</li>
<li><span style="font-weight: bold;">Everybody else is doing it.</span> Sure, right now, GitHub only hosts very minor, niche projects like <a href="http://github.com/rails/rails/tree">Ruby on Rails</a>, <a href="http://github.com/280north/cappuccino/tree">Cappuccino</a>, and <a href="http://github.com/bioruby/bioruby/tree/">BioRuby</a> (like <span style="font-style: italic;">that</span> will ever go anywhere), and Launchpad has some lesser known ones like <a href="https://launchpad.net/mysql">MySQL</a>, <a href="https://launchpad.net/zope">Zope</a>, and something called <a href="https://launchpad.net/ubuntu">Ubuntu</a>, but I hear that some major players will join these sites really soon! They do seem to be gaining in popularity very rapidly. ;-)
</li>
<li><span style="font-weight: bold;">Vendor lock-in is just not an issue.</span> There's some concern that using a third-party site such as GitHub or Launchpad will make the Biopython project vulnerable to possibly unreasonable whims of the owners of the sites. Terms and conditions could change unfavorably (e.g., "You have to pay to continue using our service."), or the service could go under. However, the OBF provides no more protection than Launchpad or GitHub, particularly for the latter scenario. When I think about who's least likely to run out of operating funding (the OBF, Launchpad, or GitHub), I'm not betting on OBF. But let's suppose that the unthinkable happens, and the site closes its doors to Biopython. So what? It's a <span style="font-style: italic;">distributed</span> SCM; we have <span style="font-style: italic;">all</span> of the code! This isn't CVS or Subversion, where a downed server takes all the revision history with it to the grave. We'll just set up shop somewhere else, point it towards our repositories, and sally on. We can burn that bridge when we get there; in the meantime, don't fret about it.</li>
</ol>
<p>At this point, I'm sure there's more discussion to have. I just hope it's not too much, given that the transition to Subversion stalled tragically, <a href="http://friendfeed.com/e/69e1c053-22c3-12ad-2264-5ba70fa41d5b/gotgenes-You-re-still-insisting-in-moving/">which I take responsibility for</a>. It would be nice to have this settled by May. I'd rather be fielding "How do I do this in Git/Bazaar?" than discussing "Why should I do this in Git/Bazaar?" My fingers are crossed, my hopes are high, and my stubbornness is fiercer than two years ago.</p>
<p><a name="lpstar">*</a> I'm excluding <a href="http://www.selenic.com/mercurial/">Mercurial</a> and <a href="http://bitbucket.org/">Bitbucket</a> here because they haven't received consideration on the mailing list. They could be a great solution, but we're least familiar with them, and we have to narrow down the choices somehow.
<a name="lpstarstar">**</a> Okay, so <a href="http://arstechnica.com/open-source/news/2008/07/mark-shuttleworth-launchpad-to-be-open-source-in-12-months.ars">Launchpad is going to be open sourced</a>, but I don't want to be in charge of running an instance of it if nobody's going to pay me; see 2.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com3tag:blogger.com,1999:blog-6363246379112249261.post-2683688420830286202009-03-11T01:35:00.011-04:002011-06-25T20:26:07.997-04:00This is a stick up! Give me all your genomes!<p><span style="font-style: italic;">This blog post is based on a <a href="http://friendfeed.com/e/b08a736b-804f-4ce3-ab73-13c8fdfa1fab/This-is-a-stick-up-Give-me-all-your/">previous entry of the same title I posted to FriendFeed</a>. This post provides an extended explanation of what we're trying to accomplish.</span></p>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/dunechaser/385847284/" title="Thieves by Dunechaser, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 250px; height: 188px;" src="http://farm1.static.flickr.com/135/385847284_b305344ffa.jpg" alt="Thieves" width="500" height="375" /></a>
<p>I'm working with <a href="http://www.ppws.vt.edu/%7Ejelesko/">Prof. John Jelesko</a> on a project for one of my courses in which he's investigating metabolic pathways in plants. At the heart of it, we need to set up a local database for running <a href="http://fasta.bioch.virginia.edu/fasta_www2/fasta_list2.shtml">FASTA homology searches</a>. The Jelesko lab wants this database to contain every amino acid sequence predicted in every currently available whole genome (assembled and annotated) available at NCBI, prokaryotic and eukaryotic. [<span style="font-style: italic;">Edit: We don't need every sequenced genome, actually, we only need a representative genome per organism. I hadn't previously considered that there may be more than one genome per organism. Thanks to Brad Chapman for pointing out the need for clarification.</span>]</p>
<p>We have sequences from locations other than NCBI which we need to include in the FASTA search space; hence, we can't just run FASTA searches over NCBI data, which <a href="http://www.ebi.ac.uk/Tools/fasta/">EBI's FASTA search</a> might otherwise be able to do. This necessitates a local database. The Jelesko lab also needs the nucleotide sequence corresponding to the amino acid sequence, as well as the intron/exon locations for the longest available splicing. The questions are: is it feasible to store this amount of data in a database (we'll be using MySQL), and if so, how do we go about getting this data?</p>
<p>We're naïvely assuming it is feasible, so I'm attempting to figure out how to get at this data. The one file format that seems to store all information that we need in one place is the GenBank (GBK) format:</p>
<ul><li>a gene ID</li><li>taxonomic classification of the organism from which the gene came</li><li>start and stop positions for each exon</li><li>the translated amino acid sequence</li></ul>
<p>It seems that in one shape or another, these GenBank format files are available from <a href="ftp://ftp.ncbi.nih.gov/genomes/">NCBI's FTP site</a>. The GBK files for the prokaryotic genomes are relatively easy to get in one fell swoop at <a href="ftp://ftp.ncbi.nih.gov/genomes/Bacteria/all.gbk.tar.gz">ftp://ftp.ncbi.nih.gov/genomes/Bacteria/all.gbk.tar.gz</a>. For good ol' eukaryotic genomes, however, the data is all over the place. Sometimes it's <a href="ftp://ftp.ncbi.nih.gov/genomes/Apis_mellifera/">stored as gzipped files in CHR folders</a>, while <a href="ftp://ftp.ncbi.nih.gov/genomes/Saccharomyces_cerevisiae/CHR_II/">other times</a>, the files aren't compressed, and still other times, <a href="ftp://ftp.ncbi.nih.gov/genomes/Fungi/">the directory is really just a container</a> for directories that have the genome data. In short, it's a mess, especially when we consider that we want to automate the retrieval of this data and update it periodically as NCBI deposits new data.</p>
<p>There's also the dilemma of not actually needing most of the data (the genome sequence) contained in the GBK files—we just need the sequence covering start to stop for translation, including intronic sequence for the mRNA. I can write a hack of a Python script to trudge through the FTP directories and yank any GBK (compressed or otherwise) to local disk, but it seems like a big waste of bandwidth and local disk space. It seems like there must be better ways [Doesn't it always?], but I don't have the knowledge of NCBI's services to identify what these might be. If you have any ideas, please share! Meanwhile, I think I'll try contacting NCBI and see if they might point me in the right direction. I'll report back on what we decide to use, which could be my FTP hack given our limited time for this project.</p>
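<p>For the curious, the "hack of a Python script" could look roughly like the sketch below, which walks an FTP tree with the standard <tt>ftplib</tt> module and yields anything that looks like a GenBank file. The function names and the list of file suffixes are my own assumptions, and the sketch also assumes the server's <tt>NLST</tt> reply contains bare file names rather than full paths, which varies between servers. The host and directory in the commented usage are the ones linked above.</p>

```python
import ftplib

# Assumed suffixes for GenBank flat files, gzipped or not.
GBK_SUFFIXES = ('.gbk', '.gbk.gz')


def is_genbank_file(name):
    """Return True if the file name looks like a GenBank flat file."""
    return name.lower().endswith(GBK_SUFFIXES)


def find_genbank_files(ftp, directory):
    """Recursively yield paths of GenBank files under directory.

    ftp is a connected, logged-in ftplib.FTP instance (or any object
    with compatible cwd() and nlst() methods).
    """
    ftp.cwd(directory)
    for name in ftp.nlst():
        path = directory.rstrip('/') + '/' + name
        try:
            # cwd() only succeeds on directories.
            ftp.cwd(path)
        except ftplib.error_perm:
            if is_genbank_file(name):
                yield path
        else:
            for found in find_genbank_files(ftp, path):
                yield found
            ftp.cwd(directory)


# Example usage (needs network access, so it is left commented out):
# ftp = ftplib.FTP('ftp.ncbi.nih.gov')
# ftp.login()
# for path in find_genbank_files(ftp, '/genomes/Fungi'):
#     print(path)
```

<p>This only finds the files; fetching each one (for example with <tt>ftp.retrbinary</tt>) is where the bandwidth and disk space concerns come in.</p>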
<p><span style="font-weight: bold;">Update:</span> I've received suggestions on <a href="http://friendfeed.com/e/f8eb0f8d-b515-3763-61b4-4194c3d52a53/This-is-a-stick-up-Give-me-all-your/">the FriendFeed entry for this blog post</a> worth checking out.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com2tag:blogger.com,1999:blog-6363246379112249261.post-70197657092374177802009-01-25T23:02:00.003-05:002011-06-25T20:27:29.168-04:00FriendFeed PyAPI, or "What I did over winter break"<p>In mid-December 2008 I made the decision to go dark and execute on a solid coding project I could sink my teeth into. On January 13, 2009, I <a href="http://groups.google.com/group/friendfeed-api/browse_thread/thread/fd7538e554649233">emerged</a> with the fruits of a lot of labor of the fingertips: a fully fledged Python interface library to the FriendFeed API, suitably named <a href="https://launchpad.net/friendfeed-pyapi">FriendFeed PyAPI</a>.<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/30589354@N03/3227733038/" title="FriendFeed Python Powered by gotgenes, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 350px; height: 51px;" src="http://farm4.static.flickr.com/3408/3227733038_1cb3d21de5.jpg" alt="FriendFeed Python Powered" width="500" height="73" /></a></p>
<p>This library began from the original Python code available from the <a href="http://code.google.com/p/friendfeed-api/">FriendFeed Google Code repository</a>. That code provided a great basis: it showed me how to implement method calls to the FriendFeed API, and it contained the authentication code, which I wouldn't have known how to write. The original library returned native Python structures of dictionaries, lists, strings, integers, and the like, parsed out using one of several JSON parsing libraries that may be available on the system. While this worked well enough, I saw a chance to really improve the library by having it work with and return full-fledged data structures to represent the various FriendFeed entities, including users, rooms, entries, comments, and media.
<a href="http://techblog.ironfroggy.com/">
Calvin Spealman</a>, a.k.a. <a href="http://twitter.com/ironfroggy">ironfroggy</a>, asked me two shrewd questions: 1) Wasn't I just creating an <a href="http://en.wikipedia.org/wiki/Object-relational_mapping">ORM [object-relational mapping]</a>? 2) Why would I do that? The answer to 1) was "Yes". My answer to 2) was, essentially, "Because I want to." Calvin understood what I now know about undertaking the process: it takes a lot of time doing grunt work coding to create an ORM. I had experience using the object-oriented <a href="http://code.google.com/p/python-twitter/">Python interface to the Twitter API</a> for <a href="https://launchpad.net/emptytwits">another project</a> for <a href="http://friendfeed.com/berci">Bertalan Meskó</a>, and I really enjoyed the "feel" of that library, and so I made it my goal to bring the same kind of feel to the FriendFeed library. The result was an expansion and refactoring of the original library of 812 lines of code to nearly 4,000 lines, 45 unit tests, 8 entity classes, about a dozen exceptions, and support for nearly all the API calls available.</p>
<p>I think the real joy for me came from creating methods to parse the JSON structures recursively and instantiate appropriate objects at each depth. These objects are then appropriately set as attributes of their parent objects (that is, the objects they "belong to"). All of this is done quite simply with a mapping scheme of entity names to methods (e.g. mapping the key <tt>'users'</tt> to the method <tt>_parse_users</tt>), and it feels quite elegant having it all work together, calling the appropriate parsing method for each structure, and returning beautiful little self-documented class instances. Witnessing it work in concert for the first time was definitely a "blinking LED moment," as my friend <a href="http://notesfromthelifeboat.com/">Ian Firkin</a> would say.</p>
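<p>To give a flavor of the key-to-method mapping, here is a stripped-down sketch. It is not the actual FriendFeed PyAPI code: the class names, the JSON fields, and the sample document are hypothetical, and the real library handles many more entity types and error cases.</p>

```python
import json


class User(object):
    def __init__(self, nickname):
        self.nickname = nickname


class Comment(object):
    def __init__(self, body, user=None):
        self.body = body
        self.user = user


class Entry(object):
    def __init__(self, title, user=None, comments=None):
        self.title = title
        self.user = user
        self.comments = comments or []


class Parser(object):
    def __init__(self):
        # Map JSON keys to the methods that know how to parse their values.
        self._parsers = {
            'user': self._parse_user,
            'comments': self._parse_comments,
            'entries': self._parse_entries,
        }

    def _parse_children(self, data):
        """Parse values whose keys have a registered parser; pass the rest through."""
        parsed = {}
        for key, value in data.items():
            parse = self._parsers.get(key)
            parsed[key] = parse(value) if parse else value
        return parsed

    def _parse_user(self, data):
        return User(**self._parse_children(data))

    def _parse_comments(self, items):
        return [Comment(**self._parse_children(item)) for item in items]

    def _parse_entries(self, items):
        return [Entry(**self._parse_children(item)) for item in items]

    def parse(self, text):
        return self._parse_children(json.loads(text))


SAMPLE = ('{"entries": [{"title": "Hello", "user": {"nickname": "gotgenes"}, '
          '"comments": [{"body": "Nice!", "user": {"nickname": "ironfroggy"}}]}]}')

entry = Parser().parse(SAMPLE)['entries'][0]
print(entry.user.nickname)              # gotgenes
print(entry.comments[0].user.nickname)  # ironfroggy
```

<p>Because every parsing method goes back through <tt>_parse_children</tt>, nested structures are handled at whatever depth they appear, and each object ends up as an attribute of the object it belongs to.</p>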
<p>Perhaps the most important lesson came not from the specific technical hurdles I made my way through, but from the personal insight that I absolutely <span style="font-weight: bold;">love</span> programming. I love writing code; I love to talk about writing code; and I really love interacting with other developers. Over the course of the couple of weeks, I consulted <a href="http://stackoverflow.com/questions/401215/how-to-limit-rate-of-requests-to-web-services-in-python">Stack Overflow</a>, hit up <a href="http://www.python.org/community/irc/"><span style="font-family:courier new;">#python</span></a> on IRC, and had direct email exchanges with <a href="http://www.benjamingolub.com/">Ben Golub</a> at FriendFeed (who, by the way, is an absolutely stand-up developer and a fantastic representative for the young service). I have a genuine sense of satisfaction from the code and documentation I produced for the project, and that feeling makes for a happier life more than any other currency (except, <span style="text-decoration: line-through;">possibly,</span> beer).</p>
<p>So what now? Well, I released FriendFeed PyAPI under the same <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License (Version 2)</a> that FriendFeed released the original library under. This means you may fork it, play with it, and modify it to your heart's content, and if you care to, let me know what improvements you've made so I can merge them back into the trunk branch. (Of course, you may also keep any and all modifications to yourself, in your quest for <a href="http://www.youtube.com/watch?v=RcisPdJVNl8">world domination</a>, though you'll still have to attribute FriendFeed and me as taking a part in your <a href="http://www.youtube.com/watch?v=cmCKJi3CKGE">doomsday device</a>.) [Edit: On second thought, please don't attribute me in those events.] I also have a <a href="http://bazaar.launchpad.net/%7Echris.lasher/friendfeed-pyapi/trunk/annotate/head%3A/TODO">list of future directions</a>, and a few ideas of my own, including the one that actually spurred this spurt of code-writing, that I look forward to releasing upon FriendFeeders. So go out and use it! <a href="https://launchpad.net/friendfeed-pyapi/+addquestion">Ask questions</a> about it! Most importantly, please <a href="https://launchpad.net/friendfeed-pyapi/+filebug">report bugs</a>!</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com2tag:blogger.com,1999:blog-6363246379112249261.post-86357989882660854282009-01-25T18:07:00.005-05:002011-06-25T20:30:04.763-04:00Class attributes and scoping in Python, Part 1<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/yorkjason/1344842047/" title="Object and Attribute by Napalm filled tires, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 250px; height: 166px;" src="http://farm2.static.flickr.com/1050/1344842047_f461f3aca3.jpg" width="500" height="333" alt="Object and Attribute" /></a>
<p>This latest post comes courtesy of <a href="http://harijay.wordpress.com/">Hari Jayaram</a>, one of those people who's on my "I'd like to meet" list. Hari asks, [paraphrased] "<a href="http://www.bioscreencastwiki.com/Python_Variable_scope_gymnastics">Does Python treat class variables as having an instance scope, while at the same time treat class lists and class dictionaries as having class scope?</a>"</p>
<p>Let's use a simple example to illustrate some gotchas with regards to class and instance attributes. We'll begin by coding up a simple class with a couple of reporter functions to help us out later on; right now I'd just like to draw your attention to class <tt>Foo</tt>'s two attributes of interest: <tt>class_attr</tt> which shall represent our class attribute, and <tt>instance_attr</tt> which—you guessed it—represents our instance attribute.</p>
<pre class="brush: python">#!/usr/bin/env python
# -*- coding: UTF-8 -*-

import pprint


class Foo(object):

    class_attr = 0

    def __init__(self, item):
        self.instance_attr = item

    def report(self):
        print("My 'class_attr' is: %s" % self.class_attr)
        print("My '__class__.class_attr' is: %s" %
              self.__class__.class_attr)
        print("My 'instance_attr' is: %s" % self.instance_attr)

    def print_self_dict(self):
        pprint.pprint(self.__dict__)

    def change_class_attr(self, item):
        self.__class__.class_attr = item
</pre>
<p>Alright, let's throw this puppy into the interactive interpreter and play with it a bit. We'll start off by creating two instances and checking them out.</p>
<pre class="brush: python">>>> from foo import Foo
>>> a = Foo('a')
>>> b = Foo('b')
>>> a.report()
My 'class_attr' is: 0
My '__class__.class_attr' is: 0
My 'instance_attr' is: a
>>> b.report()
My 'class_attr' is: 0
My '__class__.class_attr' is: 0
My 'instance_attr' is: b
</pre>
<p>So right now we see that <tt>a</tt> and <tt>b</tt> share the same class attribute value of <tt>0</tt> but different instance attribute values of <tt>'a'</tt> and <tt>'b'</tt>, respectively. Now, let's attempt to change the class variable:</p>
<pre class="brush: python">>>> a.class_attr = 1
>>> print(a.class_attr)
1
>>> print(b.class_attr)
0
</pre>
<p>Wait, <tt>a</tt> has our expected value for the class attribute, but instance <tt>b</tt> doesn't. That doesn't make sense; it's a <span style="font-style: italic;">class</span> attribute after all! Let's take a closer look at the internals, though:</p>
<pre class="brush: python">>>> a.report()
My 'class_attr' is: 1
My '__class__.class_attr' is: 0
My 'instance_attr' is: a
>>> b.report()
My 'class_attr' is: 0
My '__class__.class_attr' is: 0
My 'instance_attr' is: b
</pre>
<p>Notice the discrepancy between the reported values for <tt>self.class_attr</tt> and <tt>self.__class__.class_attr</tt> for <tt>a</tt>? Huh. It looks as if Python actually made the assignment of the new value to an <span style="font-style: italic;">instance</span> variable of the name <tt>class_attr</tt> rather than assign the value to the Foo class's <tt>class_attr</tt>. We can take a look at the instance and class dictionaries to help untangle this. First, let's compare the internal dictionaries of <tt>a</tt> and <tt>b</tt>.</p>
<pre class="brush: python">>>> a.print_self_dict()
{'class_attr': 1, 'instance_attr': 'a'}
>>> b.print_self_dict()
{'instance_attr': 'b'}
</pre>
<p>Ha! Python, we've found you out! We now can see, indeed, Python made the new value assignment to a brand new instance variable in <tt>a</tt> called (deceptively, in our deceptive case) <tt>class_attr</tt>.</p>
<p>Now let's explore how to actually convince Python to do what we <span style="font-style: italic;">meant</span> to do: reassign the class variable. Let's get a clean slate.</p>
<pre class="brush: python">>>> a = Foo('a')
>>> b = Foo('b')
>>> a.report()
My 'class_attr' is: 0
My '__class__.class_attr' is: 0
My 'instance_attr' is: a
>>> b.report()
My 'class_attr' is: 0
My '__class__.class_attr' is: 0
My 'instance_attr' is: b
</pre>
<p>One generic means by which we can reassign the class variable is to directly assign it via the <span style="font-style: italic;">class</span>, rather than via an instance of the class.</p>
<pre class="brush: python">>>> Foo.class_attr = 1
>>> a.report()
My 'class_attr' is: 1
My '__class__.class_attr' is: 1
My 'instance_attr' is: a
>>> b.report()
My 'class_attr' is: 1
My '__class__.class_attr' is: 1
My 'instance_attr' is: b
>>> a.print_self_dict()
{'instance_attr': 'a'}
>>> b.print_self_dict()
{'instance_attr': 'b'}
</pre>
<p>That worked a treat. But often in production code, we don't want to tie the fate of a class variable assignment to a hard-coded class name somewhere in some file, soon to break when we refactor our code and give the class a new name. This is where using the special variable <tt>__class__</tt> comes in handy. Take another look at the method <tt>change_class_attr()</tt>.</p>
<pre class="brush: python">    def change_class_attr(self, item):
        self.__class__.class_attr = item</pre>
<p>This uses the instance's inherent knowledge of what class it belongs to (accessed via <tt>__class__</tt>) to make the necessary assignment to the class variable. So, we see, this also works:</p>
<pre class="brush: python">>>> a.change_class_attr(2)
>>> a.report()
My 'class_attr' is: 2
My '__class__.class_attr' is: 2
My 'instance_attr' is: a
>>> b.report()
My 'class_attr' is: 2
My '__class__.class_attr' is: 2
My 'instance_attr' is: b
</pre>
<p>There's an important caveat here: this method, too, is fragile for sub-classes. For example, let's create a sub-class of <tt>Foo</tt> called <tt>Bar</tt>, and an instance <tt>c</tt>.</p>
<pre class="brush: python">>>> class Bar(Foo):
...     pass
...
>>> c = Bar('c')
>>> c.report()
My 'class_attr' is: 2
My '__class__.class_attr' is: 2
My 'instance_attr' is: c
</pre>
<p>Now let's observe what happens when we assign a new value to the class variable via <tt>c</tt>'s <tt>change_class_attr()</tt>.</p>
<pre class="brush: python">>>> c.change_class_attr(3)
>>> c.report()
My 'class_attr' is: 3
My '__class__.class_attr' is: 3
My 'instance_attr' is: c
</pre>
<p>All's well, but notice this only affected the <tt>Bar</tt> class's <tt>class_attr</tt>, <span style="font-style: italic;">not</span> the <tt>Foo</tt> class's:</p>
<pre class="brush: python">>>> a.report()
My 'class_attr' is: 2
My '__class__.class_attr' is: 2
My 'instance_attr' is: a
>>> print(Foo.class_attr)
2
</pre>
<p>Failing to make note of this can come back to bite Python programmers in the tail. For example, you may use a class attribute to keep track of the number of instances of that class. If you would like to keep track of compatible sub-class instances, too, however, the <tt>__class__</tt> trick will prove insufficient; a hard-coded class name would prove more suitable. Use this knowledge to make the right decision for your particular scenario.</p>
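<p>Here's a small sketch of that instance-counting scenario, comparing a hard-coded class name with the <tt>__class__</tt> trick. The class names are made up for the example.</p>

```python
class Counted(object):
    instances = 0

    def __init__(self):
        # Hard-coded class name: subclasses share this single counter.
        Counted.instances += 1


class PerClassCounted(object):
    instances = 0

    def __init__(self):
        # Reads the current value through the instance's class, then
        # assigns on that class, so a subclass gets its own counter.
        self.__class__.instances += 1


class CountedChild(Counted):
    pass


class PerClassChild(PerClassCounted):
    pass


Counted()
CountedChild()
CountedChild()

PerClassCounted()
PerClassChild()
PerClassChild()

print(Counted.instances)          # 3: the parent and both children
print(PerClassCounted.instances)  # 1: only the parent instance
print(PerClassChild.instances)    # 3: started from the inherited 1, then +2
```

<p>Note the last line: the first <tt>PerClassChild</tt> instance reads the inherited value of 1 before writing its own attribute, so the subclass counter doesn't even start from zero. That's one more reason to prefer the hard-coded name when you want a single shared count.</p>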
<p>In the next part, I'll be covering an even more interesting scoping question dealing with lists and other mutables as class variables.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com8tag:blogger.com,1999:blog-6363246379112249261.post-57048605185954123042009-01-23T02:22:00.011-05:002011-06-25T20:31:11.996-04:00Tab-completion and history in the Python interpreter<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/melyviz/301735131/" title="The Interpreter by melyviz, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 250px; height: 166px;" src="http://farm1.static.flickr.com/103/301735131_d27c49e07e.jpg" alt="The Interpreter" height="333" width="500" /></a>
<p>I usually use <a href="http://ipython.scipy.org/">IPython</a> as my interactive Python interpreter, but it has problems with Unicode decoding, which hurts when I need to deal with Unicode (such as when I'm working with <a href="https://launchpad.net/friendfeed-pyapi/">FriendFeed PyAPI</a>). When I complained about this on <span style="font-family:courier new;">#python</span>, one of the users told me I should use the standard Python interpreter anyway. When I told him I did not use the standard interpreter because I loved the convenience of tab-completion in the IPython shell, he informed me that, indeed, the standard interactive interpreter can do auto-complete.</p>
<p>After some Googling, I came upon <a href="http://blog.venthur.de/2008/07/06/tab-completion-in-pythons-interactive-mode/">this blog post</a>. I wound up using a modified solution posted in the comments. Here's my <tt>.pythonrc</tt> file:</p>
<pre class="brush: python">import atexit
import os.path
try:
    import readline
except ImportError:
    pass
else:
    import rlcompleter
    class IrlCompleter(rlcompleter.Completer):
        """
        This class enables a "tab" insertion if there's no text for
        completion.
        The default "tab" is four spaces. You can initialize with '\t' as
        the tab if you wish to use a genuine tab.
        """
        def __init__(self, tab='    '):
            self.tab = tab
            rlcompleter.Completer.__init__(self)
        def complete(self, text, state):
            if text == '':
                readline.insert_text(self.tab)
                return None
            else:
                return rlcompleter.Completer.complete(self, text, state)
    # You could change this line to bind another key instead of tab.
    readline.parse_and_bind('tab: complete')
    readline.set_completer(IrlCompleter().complete)
    # Restore our command-line history, and save it when Python exits.
    history_path = os.path.expanduser('~/.pyhistory')
    if os.path.isfile(history_path):
        readline.read_history_file(history_path)
    atexit.register(lambda x=history_path: readline.write_history_file(x))
</pre>
<p>I then added the following line to my <tt>.bashrc</tt>:</p>
<pre class="brush: bash">export PYTHONSTARTUP="$HOME/.pythonrc"</pre>
<p>Now I can remain a happy camper using the native interactive interpreter.</p>
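<p>For anyone curious how the completer gets used, here is a small sketch of the calling convention that <tt>readline</tt> follows: it calls <tt>complete(text, state)</tt> with <tt>state</tt> set to 0, 1, 2, and so on, until the completer returns <tt>None</tt>. The namespace and the helper function below are made up for illustration and are not part of my <tt>.pythonrc</tt>.</p>

```python
import rlcompleter

# Hypothetical namespace to complete against; the names are invented.
namespace = {'alpha': 1, 'alphabet': 2}
completer = rlcompleter.Completer(namespace)


def all_matches(completer, text):
    """Collect completions the way readline asks for them:
    state 0, 1, 2, ... until complete() returns None."""
    matches = []
    state = 0
    while True:
        match = completer.complete(text, state)
        if match is None:
            break
        matches.append(match)
        state += 1
    return matches


found = all_matches(completer, 'alph')
print(found)
```

<p>This is also why the <tt>complete</tt> override in <tt>IrlCompleter</tt> returns <tt>None</tt> right after inserting the tab: returning <tt>None</tt> tells <tt>readline</tt> there are no (more) completions to offer.</p>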
<p><span style="font-weight: bold;">Update (2009-1-25):</span> Thanks to Bob Erb's comments, I corrected some poor indentation (whoops!) and also added the final lines to remove the <tt>atexit</tt> and <tt>os.path</tt> modules from the main namespace.</p>
<p><span style="font-weight: bold;">Update (2009-4-18):</span> I removed the deletion of <tt>atexit</tt> and <tt>os.path</tt> from the main namespace. That seemed to wreck any script that needed either of those; quite a few scripts rely on os.path, in particular.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com4tag:blogger.com,1999:blog-6363246379112249261.post-83263721487697820102008-12-18T16:50:00.010-05:002011-06-25T20:35:54.512-04:00Robust imports in Python, guaranteed fresh: how to import code for testing<p><span style="font-weight:bold;">UPDATE 2010-01-19:</span> As captnswing pointed out, an alternative, and I should say more commonly used, method is to simply put the following before the import statements for your packages or modules, assuming you keep your tests in a subdirectory of your code.</p>
<pre class="brush: python">
import os.path
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), os.path.pardir))
</pre>
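<p>As a rough sketch of why this version does not care where you run it from, the same idea can be wrapped in a pair of helper functions. The function names and the example path are my own inventions for illustration; in a real test module you would pass <tt>__file__</tt>.</p>

```python
import os.path
import sys


def parent_dir_of(module_file):
    """Return the absolute path of the directory above the one
    containing module_file."""
    here = os.path.dirname(os.path.abspath(module_file))
    return os.path.abspath(os.path.join(here, os.path.pardir))


def add_parent_to_sys_path(module_file):
    """Put the parent directory at the front of sys.path, once."""
    parent = parent_dir_of(module_file)
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent


# A made-up path stands in for __file__ here.
example = os.path.join(os.sep, 'rootdir', 'tests', 'mymodule_tests.py')
print(parent_dir_of(example))
added = add_parent_to_sys_path(example)
```

<p>Because the path is anchored on the file's own location rather than on the current working directory, the result is the same no matter where the tests are launched from.</p>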
<hr />
<p><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/burnblue/308441464/" title="Beer Tasting by BURИBLUE, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 240px; height: 87px;" src="http://farm1.static.flickr.com/111/308441464_c5d9def328_m.jpg" width="240" height="87" alt="Beer Tasting" /></a>Anyone who knows me knows I like unit tests. I mean, I <span style="font-style: italic;">really</span> like unit tests. Like, if Mr. Software Engineering were to offer to betroth me to one of his daughters, I would ask him to betroth me to Miss Unit Test.</p>
<p>One thing that comes up when preparing tests in Python is, "Where the hell do I put them?" To this, my first answer is, "If you're willing and diligent enough to write them, you can put them damn well anywhere you please!" If that answer doesn't satisfy you, though, that's good, you're not alone. Python programmers have raised this topic on several forums, including recently on the <a href="http://lists.idyll.org/pipermail/testing-in-python/2008-November/001073.html">Testing in Python mailing list</a> and on <a href="http://stackoverflow.com/questions/61151/where-do-the-python-unit-tests-go">Stack Overflow</a>.</p>
<p>I'm a fan of the following method, which seems to have taken dominance in the Python community. It's based around the following directory structure:</p>
<pre>
rootdir/
rootdir/mymodule.py
rootdir/tests/
rootdir/tests/mymodule_tests.py
</pre>
<p>We have a directory containing our module of interest, <span style="font-family:courier new;">mymodule.py</span>, and we have a module, <span style="font-family:courier new;">mymodule_tests.py</span>, containing our unit tests for <span style="font-family:courier new;">mymodule.py</span>. We create a subdirectory, <span style="font-family:courier new;">tests/</span>, under the root directory, <span style="font-family:courier new;">rootdir/</span>, of the project, and we place our <span style="font-family:courier new;">mymodule_tests.py</span> under this directory so that its path is <span style="font-family:courier new;">rootdir/tests/mymodule_tests.py</span>.</p>
<p>We've got to import the module we want to test into the module containing the tests for it. The import statement works for all packages/modules currently in the import path, found in the list <span style="font-family:courier new;">sys.path</span>. Since the directory containing the script being run is added to <span style="font-family:courier new;">sys.path</span> by default (in an interactive session, it is the current directory instead), we can easily import any packages/modules on the same level as our importing module. This would be in the form of a simple import statement of</p>
<pre class="brush: python">
import mymodule
</pre>
<p>For the typical testing layout, though, this won't suffice. We'll get a big fat <span style="font-family:courier new;">ImportError</span>. This is because <span style="font-family:courier new;">mymodule.py</span> lives in <span style="font-family:courier new;">rootdir/</span>, above our testing module's <span style="font-family:courier new;">rootdir/tests/</span> path. The next logical step, then, is to place <span style="font-family:courier new;">rootdir/</span> in <span style="font-family:courier new;">sys.path</span> for <span style="font-family:courier new;">mymodule_tests.py</span> to access <span style="font-family:courier new;">mymodule.py</span>. The initial thought for doing this is to add the directory above to <span style="font-family:courier new;">sys.path</span> using a relative path.</p>
<pre class="brush: python">
#!/usr/bin/env python
import os
import sys
sys.path.insert(0, os.pardir)
</pre>
<p>Unfortunately, this is fragile. If we run <span style="font-family:courier new;">mymodule_tests.py</span> from outside its own directory, this will break the path. Take the following script as an example:</p>
<pre class="brush: python">
#!/usr/bin/env python
# parpath.py: print the parent path
import os
print "parent directory:", os.path.abspath(os.pardir)
</pre>
<p>I place this script in the path of
<span style="font-family:courier new;">/home/chris/development/playground/</span>, and then run it from this directory</p>
<pre>
[chris]─[@feathers]─[2495]─[15:35]──[~/development/playground]
$ python parpath.py
parent directory: /home/chris/development
</pre>
<p>When I run the script from the parent directory, however, my results differ.</p>
<pre>
[chris]─[@feathers]─[2496]─[15:36]──[~/development/playground]
$ cd ..
[chris]─[@feathers]─[2497]─[15:36]──[~/development]
$ python playground/parpath.py
parent directory: /home/chris
</pre>
<p>In the words of Austin Powers, "<a href="http://www.youtube.com/watch?v=6zIYvBY2DzY">That's not right.</a>" Now instead of getting the directory I wanted (<span style="font-family:courier new;">/home/chris/development</span>), I get the one above it (<span style="font-family:courier new;">/home/chris</span>). This is because relative paths in sys.path are relative to where you <span style="font-style: italic;">executed</span> the script, not relative to where the script exists. Phooey!</p>
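<p>The same effect can be reproduced without any shell session. The following is only a sketch, assuming a throwaway temporary directory with a <span style="font-family:courier new;">playground</span> subdirectory (both created just for the demonstration, with a helper name I made up): it resolves <span style="font-family:courier new;">os.pardir</span> from two different working directories and gets two different answers.</p>

```python
import os
import tempfile


def parent_seen_from(directory):
    """Resolve os.pardir while `directory` is the current working
    directory, then restore the original working directory."""
    original = os.getcwd()
    os.chdir(directory)
    try:
        return os.path.abspath(os.pardir)
    finally:
        os.chdir(original)


# Throwaway directories standing in for ~/development and
# ~/development/playground; realpath avoids symlinked temp locations.
base = os.path.realpath(tempfile.mkdtemp())
playground = os.path.join(base, 'playground')
os.mkdir(playground)

from_playground = parent_seen_from(playground)
from_base = parent_seen_from(base)

print("from playground:", from_playground)
print("from its parent:", from_base)

# Clean up the throwaway directories.
os.rmdir(playground)
os.rmdir(base)
```

<p>The relative path is identical in both calls; only the working directory changed, and that alone was enough to change the answer.</p>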
<p>I used to just ignore this fragility and be very careful about running tests from within the same directory as the test modules. However, last night I came across a robust solution by way of some <a href="http://www.google.com/codesearch">Google Code Search</a> Fu, specifically while browsing <a href="http://www.google.com/codesearch?hl=en&q=show:duoXlmF7OnY:zZzm1w1OJ9E:4-jfwWAun48&sa=N&ct=rd&cs_p=http://freshmeat.net/redir/moin/6595/url_tgz/moin-1.5.7.tar.gz&cs_f=moin-1.5.8/tests/maketestwiki.py">test code for MoinMoin</a>. It turns out the solution is to use something like the following:</p>
<pre class="brush: python">
path_of_exec = os.path.dirname(sys.argv[0])
parpath = os.path.join(path_of_exec, os.pardir)
sys.path.insert(0, os.path.abspath(parpath))
</pre>
<p>If we take a look at the first line, we see that it captures the first element of <span style="font-family:courier new;">sys.argv</span> and uses it to construct a robust path that knows where the actual module is. When a script is run, the first element of <span style="font-family:courier new;">sys.argv</span> is the script path exactly as it was given after <span style="font-family:courier new;">python</span> on the command line (or as typed when executing the script directly with <span style="font-family:courier new;">./</span>). In our examples, these would be <span style="font-family:courier new;">parpath.py</span> and <span style="font-family:courier new;">playground/parpath.py</span>, respectively. Then, running <span style="font-family: courier new;">os.path.dirname</span> on these, we get '' and '<span style="font-family:courier new;">playground</span>', respectively. By joining these to the parent directory, we get the desired effect.</p>
<pre class="brush: python">
#!/usr/bin/env python
# parpath.py
import os
import sys
print "parent path:", os.path.abspath(os.pardir)
path_of_exec = os.path.dirname(sys.argv[0])
print "execution path:", path_of_exec
parpath = os.path.abspath(os.path.join(path_of_exec, os.pardir))
print "true parent path:", parpath
</pre>
<p>This gives us the following results:</p>
<pre>
[chris]─[@feathers]─[2467]─[16:32]──[~/development/playground]
$ python parpath.py
parent path: /home/chris/development
execution path:
true parent path: /home/chris/development
[chris]─[@feathers]─[2467]─[16:32]──[~/development/playground]
$ cd ..
[chris]─[@feathers]─[2467]─[16:33]──[~/development]
$ python playground/parpath.py
parent path: /home/chris
execution path: playground
true parent path: /home/chris/development
</pre>
<p>Now we're cooking with the good sauce! Ultimately, you can create a shortened version which looks similar to the one from MoinMoin:</p>
<pre class="brush: python">
#!/usr/bin/env python
import os
import sys
parpath = os.path.join(os.path.dirname(sys.argv[0]), os.pardir)
sys.path.insert(0, os.path.abspath(parpath))
</pre>
<p>So now you, too, can enjoy a <a href="http://www.guinness.com/">fine import</a> from the comfort of your own <span style="font-family:courier new;">~</span>, or anywhere else.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com1tag:blogger.com,1999:blog-6363246379112249261.post-27483290319270968552008-12-11T16:40:00.002-05:002011-06-25T20:36:26.563-04:00Today, I quit<p><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/erikogan/111394461/" title="Dip by erikogan, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 166px; height: 250px;" src="http://farm1.static.flickr.com/37/111394461_1335075839.jpg" alt="Dip" width="333" height="500" /></a>Today, I quit.</p>
<p>I had discussions with my advisor, the head of our Ph.D. program, a distinguished, experienced, disinterested professor, and my closest friends and colleagues. I re-read <a href="http://www.squidoo.com/theDipBook">The Dip</a>. I mulled over my thoughts. I made my decision. I parted from my research group.</p>
<p>I quit because I understood I have to change groups to get my Ph.D. The current situation did not work. It was a Cliff. I could not change the situation; I had to change situations. I saw the choice I had to make: I could squander time and energy—mine, my advisor's, my colleagues', the taxpayers', the world's—until I fell off a Cliff, or I could quit, and find a Dip where I will excel and flourish.</p>
<p>Today I quit.</p>
<p><span style="font-style: italic;">I dedicate this post to the patience, understanding, advice, and aid of those who helped me make this decision. You have my deepest gratitude.</span></p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com6tag:blogger.com,1999:blog-6363246379112249261.post-2568346735069504362008-12-10T23:29:00.014-05:002011-06-25T20:36:53.488-04:00Stack Overflow: What's in it for the programmer?<p><a href="http://taxonomy.zoology.gla.ac.uk/rod/rod.html">Roderic Page</a> made a <a href="http://friendfeed.com/e/31c37fc1-9077-1153-2397-b19392ad5c38/Stack-Overflow/">recent post to FriendFeed</a> about the website <a href="http://stackoverflow.com/">Stack Overflow</a> that set off the little hamster-powered mechanical wheels within my brain on a question I have had since <a href="http://twitter.com/gotgenes/status/1009061344">I first encountered the site</a>: Why would a programmer expend the time and energy to answer questions there? What's in it for the programmer?</p>
<p><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/nnova/2505891967/" title="stack overflow by nicolasnova, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 200px; height: 150px;" src="http://farm3.static.flickr.com/2157/2505891967_e6b8af2e01.jpg" alt="stack overflow" /></a>Stack Overflow, to me, is the latest in my encounters with developer help forums, others including the <a href="http://forums.somethingawful.com/forumdisplay.php?forumid=202">Cavern of COBOL</a> on the Something Awful forums, the <a href="http://mail.python.org/mailman/listinfo/tutor">Python Tutor mailing list</a>, and particularly the comp.lang newsgroups on Usenet (where I've received some of the greatest answers to my questions). That's not to mention all the channels I haunt on Freenode. Up until Roderic's post, however, I hadn't really considered how little thought I had given to the economics behind answering someone else's programming question.</p>
<p>Stack Overflow offers one key feature the other help forums lack: reputation points—Slashdot karma for code monkeys. Before the days of Stack Overflow, you had to lurk for a while to figure out who the <a href="http://holdenweb.com/">Steve Holden</a>s and <a href="http://www.freenetpages.co.uk/hp/alan.gauld/">Alan Gauld</a>s were, or to know you should be excited that the <a href="http://effbot.org/">effbot</a> answered <a href="http://groups.google.com/group/comp.lang.python/browse_thread/thread/cf9949fce8d51e7e/b1a9ed078384496c">one of your posts</a>. You built credibility slowly by answering posts astutely and asking really interesting questions, yet, your credibility really only extended to others who made the time to be "in the know".</p>
<p>The Stack Overflow model brings instant recognition of credibility by someone new to the place. This provides tangible incentive to stick with the community. You still have to pay your dues to get your credit: ask smart questions, write good answers. Now, though, you get to carry those contributions with you as a scout carries her sash of merit badges. Now the newbie can see your sash of merit badges and compare them to everyone else's sashes, and make valuable decisions based on social status that would have previously only been possible after months of lurking, which can save a lot of time for those who only scan answers.</p>
<p>Ultimately, though, I ask the question, <a href="http://www.youtube.com/watch?v=Ug75diEyiA0">"Where's the beef?"</a> If you're a programmer, shouldn't you be... well... programming? If you have your own work to get done, why help do someone else's, for no pay? Is the other person's problem more intellectually stimulating than your own? If so, shouldn't you <a href="http://www.squidoo.com/thedipbook">quit your job</a> and spend the time finding yourself a more challenging one?</p>
<p>If I were hiring a programmer, found a potential hire's profile on Stack Overflow, and discovered they had accrued a lot of points, I'd be of two minds about it. On the upside, this programmer knows what she's talking about enough to convince other programmers she knows what she's talking about; on the downside, this programmer spent a tremendous amount of time doing work that's not her own. Now I don't hire programmers, and it's not clear I ever will, but as someone who would like to be hired for programming, I have these concerns on my mind.</p>
<p>It's worthwhile to compare Stack Overflow points to <a href="https://launchpad.net/">Launchpad</a> or <a href="http://github.com/">GitHub</a> points. On Launchpad or GitHub, a programmer gains points by submitting patches, doing bugfixes, and making commits to projects. On the surface, I feel like these are two different point systems, where the Launchpad/GitHub points actually mean more, and would be seen as more productive. Under re-examination, though, I don't feel confident I can defend contributions on these social development sites from the same critical questions I posed above about Stack Overflow and other programmer forums.</p>
<p>Supposing your job <span style="font-style: italic;">is</span> to work on a piece of software tracked by Launchpad or GitHub, then all your points really indicate your productivity to a manager or potential employer. In the cases where your work is hobbyist in nature, then I think one could make the same argument for concern that I made for Stack Overflow.</p>
<p>I'll put out a few caveats to you lovely readers here: I consider myself nowhere near the paragon of the focused worker, and in fact, staying on task is one of my greatest shortcomings, and the one I spend the most time working on. (Exhibit A: this blog post.) Also, I acknowledge that there is a certain indescribable joy in the act of community service: providing aid at a cost to yourself for the benefit of the receiver. And sometimes, you just have an itch to scratch. I like people who do community service, and I was raised to think it's a Good Thing.</p>
<p>I will continue to think about programmer forums, much as economists wonder about free open source software and contributions to Wikipedia. In the meantime, well, I'll probably <a href="http://stackoverflow.com/questions/358486/why-do-you-post-to-stack-overflow">pose this as a question to Stack Overflow</a>.</p>
<p>Update: Just as an aside, the user with the highest reputation on Stack Overflow is also a <a href="http://stackoverflow.com/users/1968/konrad-rudolph">bioinformatics grad student</a>.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com1tag:blogger.com,1999:blog-6363246379112249261.post-46281846587261478952008-12-10T16:42:00.005-05:002011-06-25T20:37:29.061-04:00How tags compress semantics<p>I just had a simple, yet (personally) powerful revelation—a moment of grokness, if you will. While searching through <a href="http://delicious.com/gotgenes">my Delicious account</a> for a bookmark to a <a href="http://www.ted.com/">TED talk</a> to link to in another blog post, I came face to face with a predicament that made me really stop and think.</p>
<p>I began my search by using the tag "ted", with which I've tagged all TED talks I've bookmarked. I have 78 TED talks bookmarked. The bookmark entries for these posts have 280 distinct tags, 1024 words total in their bookmark title fields, and 1668 words total in the comment fields. The predicament is, do I look through 78 posts to find the one of interest, or do I instead look through the 280 tags?<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/robyn-gallagher/295194457/" title="Tag by Robyn Gallagher, on Flickr"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 200px; height: 150px;" src="http://farm1.static.flickr.com/107/295194457_7aefd68e60.jpg" alt="Tag" /></a></p>
<p>Or is it? If we rephrase the question, we see I'm really asking, "Can I find what I'm looking for faster using 280 words, or 2692 words?" See, those 280 tag words actually represent a <span style="font-style: italic;">compression</span> of the semantics (the meanings) of the 2692 descriptive words. I can quickly scan 280 tags to identify the closest to my concept, giving me a significantly more manageable subset of posts to scan in more detail.</p>
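<p>For my own bookmarks the arithmetic is 1024 + 1668 = 2692 descriptive words against 280 tags, or roughly a tenth as many words to scan. As a rough sketch of how such numbers can be tallied and then put to use, here is a toy version with three invented bookmarks. The data, field names, and helper functions are all hypothetical; this is not the pydelicious API, just plain Python dictionaries.</p>

```python
# Invented bookmarks for illustration only.
bookmarks = [
    {'title': 'A talk about bird song',
     'comment': 'Field recordings and what they reveal',
     'tags': ['ted', 'birds', 'music']},
    {'title': 'A talk about city design',
     'comment': 'How streets shape daily life',
     'tags': ['ted', 'cities', 'design']},
    {'title': 'A talk about musical instruments',
     'comment': 'Building instruments from found objects',
     'tags': ['ted', 'music', 'design']},
]


def count_words(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())


def with_tag(bookmarks, tag):
    """Return only the bookmarks carrying the given tag."""
    return [b for b in bookmarks if tag in b['tags']]


# The small vocabulary: every distinct tag across all bookmarks.
distinct_tags = set(tag for b in bookmarks for tag in b['tags'])

# The large vocabulary: every word in every title and comment.
total_words = sum(count_words(b['title']) + count_words(b['comment'])
                  for b in bookmarks)

print(len(distinct_tags), "tags versus", total_words, "descriptive words")
print(len(with_tag(bookmarks, 'music')), "bookmarks left after picking 'music'")
```

<p>Scanning the short tag list first and then filtering is the two-step search I describe above: the tags act as a compact index into the much longer descriptions.</p>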
<p>Tags seemed very straightforward and powerful before, for example, reading <a href="http://www.shirky.com/writings/ontology_overrated.html">Clay Shirky's article on the power of tagging</a>, but it took this moment to really understand the power behind them, much like the "A ha!" moment of seeing a binary search when you've always thought of search as linear.</p>
<p>Two side notes:
<ul><li>I'd like to thank the developers of <a href="http://code.google.com/p/pydelicious/">pydelicious</a> for providing me the software to extract those statistics about my Delicious tags.</li><li>It turns out the video I was looking for had the clip of interest <a href="http://blog.ted.com/2008/11/wheres_the_gori.php">removed due to copyright permissions</a>, and so the real answer to the question was to <a href="http://letmegooglethatforyou.com/">Google it</a>. Still, it was worth it for the thought.
</li></ul></p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com0tag:blogger.com,1999:blog-6363246379112249261.post-849427247688012152008-11-19T02:06:00.006-05:002014-04-25T12:41:31.845-04:00Wanted: separation of personal and professional me<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.flickr.com/photos/grimages/2816577131/" title="Split Personality by Johnny Grim, on Flickr"><img style="margin: 0pt 10pt 10px 0px; float: left; cursor: pointer; width: 240x; height: 184px;" src="http://farm4.static.flickr.com/3271/2816577131_6048d282d6_m.jpg" width="240" height="184" alt="Split Personality"></a>
<p>I subscribe to a number of online social services. I began using these services, particularly <a href="http://twitter.com/gotgenes">Twitter</a>, for personal communication with friends I knew from meatspace (i.e. face to face interaction). Something interesting happened, however, when I stumbled upon <a href="http://mndoci.com/blog/">Deepak Singh</a>'s blog and discovered he, too, was on <a href="http://twitter.com/mndoci">Twitter</a>. From there I traced through his network and became a subscriber to the tweets of a dozen other researchers, all posting notes about research in bioinformatics and the life sciences.</p>
<p>After the Dark Age of Twitter in the summer of 2008, when the site consistently suffered downtime and slow performance, I also joined up with the <a href="http://friendfeed.com/gotgenes">FriendFeed</a> service after chatter from the bio Twitter gang about that service. Once there, I discovered a nice stream of everyone's activities, most of which I find professionally interesting and relevant.</p>
<p>In fact, to a large degree, the people that I follow on FriendFeed keep their own streams extremely professional, albeit sometimes opinionated. I want these people to follow me, too. I want to use these services to build professional contacts. I want these people to see me as a potential employee/collaborator/expert. To convince them of this, I have to stay concerned with keeping my signal to noise ratio very high, and every personal matter I share moves that ratio in the wrong direction for these people.</p>
<p>On the other hand, my friends with whom I socialize typically don't have an interest in my professional pursuits. We share interests in hobbies, film, humor, and common emotional trials and triumphs. To my friends, silencing personal interaction removes their desire to keep in touch with me via these media, which, after all, I began using because they proved effective at communicating with them.</p>
<p>This presents me with a quandary. In meatspace socializing, I can be who I need to be for each person—the student, the friend, the musician—and I do it all under one identity. Each of these is a facet of the whole that is me. In online social networks, however, I cannot do these under one identity. I cannot distinguish between "music" me, "friend" me, and "Pythonista" me.</p>
<p>So here's my call for reform. Dear Twitter, FriendFeed, and any and all social sites: Give me the ability to state the context of interest of each post. Give my subscribers the ability to filter which content they receive from me and how much or little of it they wish to see. Let them mix, match, and mash up those subscriptions as they need to. Let me be me, and let everyone else see only the facets that interest them most.</p>Anonymoushttp://www.blogger.com/profile/01078483442220289712noreply@blogger.com1