MyLife, MyRamblings, Tech

by Maria on 28 Feb 2015 - 23:37  

The other day I was at a meetup, and the subject of allies came up. Specifically, how white males can be allies for women and minorities in tech. I was reminded of an incident that happened to me while I was in the Army, and I thought it might be a useful story to a larger audience.

Many years ago, I was a truck mechanic in the Army National Guard. By coincidence, my motor pool was losing its Motor Sergeant at the same time that our unit was being consolidated with another unit. I was next in line to be Motor Sergeant. So, I was taking over at the same time that our motor pool was more than doubling in size, and, of course, the new mechanics were all male, and did not know me. The first weekend was rough. The new guys were clearly reluctant to take orders from me, and things weren't going so well. Weirdly, the next weekend was completely different, and I couldn't figure out what had happened that I had suddenly gained their respect. I was talking to a guy in the motor pool that I had known for years, and mentioned how much better things were going, and I didn't understand what had happened. He was a big, charismatic guy. The kind of guy that people intuitively look up to and respect. He told me he had seen what was going on, and the next time the new guys were hanging out and talking, he had joined their conversation. When my name had come up, he had told them, "Nah, man, she's alright. She knows her shit, and she's cool." And that was literally all it took. One person, alert enough to see that someone was being undermined, simply because of their sex, and stepping up to defend. It would have taken me months to get to that place of respect without him, assuming I ever could have.

That respect is given by default to members of the majority, but must be earned by someone not seen as already 'in'. There are, of course, exceptions in both directions to this, but I think it is an excellent rule of thumb. It is so much easier and more effective for someone in the majority to point out this imbalance than it is for the person being disrespected, or anyone else in the minority, to do it. This is truly one of the best examples of how allies can help.

If you see someone that you think is not being taken seriously simply because of their race, sex, orientation, or whatever, step up and say something. A simple "hey, let's hear her out" can do wonders in making people realize their unconscious biases are showing through. Just asking a question that shows that you are taking her seriously can make others realize that they may have been overlooking something: "What did you say? That sounded like a pretty good idea. Can you explain it again? I'm not sure everyone heard it."

Pay attention when a woman/minority says something. How are people reacting? Listen, observe, speak up. This is stuff that allies have to look for, because it is very easy to not see it, if you are not in the minority, and especially if you also feel like you are fighting to have your voice heard. (Cause, yeah, life is hard for those in the majority too. Word.) Members of the majority can shout down others, and generally lose no respect from the majority because of it. Women and minorities can lose respect, simply by fighting to have their voice heard. Allies amplifying their voices is absolutely critical to getting more women and minorities to stay in tech.

Some humor, which somehow feels appropriate. XKCD awesomeness.

Allies ~ Comments: 0


Busy, Shmizzy


MyLife, MyRamblings, Tech, Code

by Maria on 21 Oct 2014 - 06:43  

So, last post, I had been told by someone at a user group that I could not become a great programmer working by myself. I really love my job, so I set out to find a way to do exactly that.

As I thought about my predicament, I thought, sheesh, there must be hundreds of people just like me at the university in exactly the same pickle, all of us working mostly by ourselves in research labs all over campus, and probably a good percentage of us self-taught. I started poking around the UW website, and was surprised to find no sort of network of developers. So, I started one. In May of this year, I began trying to figure out how to track down fellow developers at the UW, and it turns out this is no easy task. But, as of October there are 87 subscribers, so I'm making progress. If you know any software developers at the UW, please send them to this site:

to subscribe to my mailing list.

We have started having regular meetings as well. It has been a lot of fun. We have been looking at code, and talking about research and software development. I started my list at an opportune time, because others were also feeling there was a void. There is now an organization at the UW called eScience, and they are very interested in improving coding practices in science at the UW. When they found out about our group, they volunteered to help out. Currently they help with organization and bring snacks to our meetings, total win! Additionally, as a community we are receiving many awesome opportunities. For example, in November, I and many others on the list will be attending a Software Carpentry Instructors training.

Which I am really looking forward to. Science and coding, why not do both well? Plus, we get to do this:

09:15: Teaching as a performance art (2)

So we can share the love.

I have been looking for ways for our group to meet on a regular basis to do some live coding, and I am contemplating starting a coding series of sorts. My current idea is that I'd like to take this book:

Head First Design Patterns

By Eric Freeman, Elisabeth Robson, Bert Bates, Kathy Sierra

and go through it as a group. Each time we meet we would talk about one or more patterns, and talk about how it translates into the various languages that people in the group know, and hopefully do some group or pair coding and share it.

So, if you have tried something similar, I'd love to hear how it went! Or if you have ideas of other things that have worked with your group, I'd like to hear that too. Finally, we are always looking for speakers that have experience in the juncture of code and science, especially with incorporating best practices, so feel free to drop me a line if you want to help or come talk!

In addition to this group I have formed, I have just started TAing for an Introductory Python Course, because there is no better way to really learn material than to teach it!

I don't know if I am becoming a great programmer, but I am learning a lot. Maybe not as quickly as if I were working daily with other developers, but I get to keep my cool job, and still learn more about best practices and about coding from other developers, so I'm pretty sure this is the appropriate response:

Busy, Shmizzy ~ Comments: 0


So sue me, I've been busy


MyLife, MyRamblings, Tech, JobHunting

by Maria on 20 Oct 2014 - 06:56  

This is a longish rambling essay, I mean a super interesting essay, explaining why I haven't posted in so long (I've been very busy!), what I've been up to (talking to people, ye gads!), and some advice for job hunting, especially if you aren't currently worried about employment (cause job hunting sucks and you should worry, er I mean, prepare before you're out there!).

I want to tell you a story. A few years ago, I found myself pregnant fairly soon after receiving word that I would be losing my job. The only reason I mention that I was pregnant is because that meant I really couldn't spend much time job hunting before I lost my job, even though I had quite a few months warning. I would be getting decent severance, so I wasn't too panicked. Yes, the economy still sucked, but it was starting to recover, and tech jobs were the first returning. Other people staying in Seattle from my lab had managed to find new employment before my boss moved, which I also took as a good sign. The last time I had been out looking for work was during the dot com bust, and while it had taken a few months, I had more experience this time. So, a few months after my boy was born, my job went away and I received a rude awakening. The job market had changed.

I've already done a rant about tech interviews, so I won't rant about that again. And of course, the job market had changed in quite a few ways in 12 years, but one of the biggest lessons I learned was the importance of networking in today's market, and how when you find yourself unemployed, that is not the ideal time to start networking. We have all heard about how important networking is, but what does that mean? Well, I think the bottom line is, it means getting to know lots of people that are willing to recommend you for a job. Yes, someone you meet once or twice at a networking event may tell you that their company is looking for someone that has your skills, and they may even get your resume in front of the hiring manager, which is definitely a step up from applying through a website, don't get me wrong, but it is not your most likely route to a job. What you really want is for someone who knows you, your work ethic, your personality, and your skills, and who also knows the hiring manager pretty well, to recommend you. Someone you met at a networking event a few times just can't do that. So, the way to effectively network is to get to know as many people as well as possible. Not just any people, but people that understand the skills needed to do your job, in other words, people doing your job or a pretty similar one. You need to get to know them as well as possible by talking shop, and a lot of it, and not just about your favorite movie or whatever. You know you have probably reached that point with someone when you would be completely comfortable recommending them for a job, so yeah, this takes a while, and if you can do joint projects, all the better. And then, when you are looking for a job, you should personally hit up every one of those connections, and ask them if they have heard of any job openings you may be interested in. And remind them on a regular basis that you are looking, but somehow strike that balance of not being a pest while keeping your name in their thoughts.
I'm afraid I don't have much advice on that last one, and I suspect it depends on the person and the method as much as anything.

So now I can finish my story. The reason I had lost my job was because my boss had moved to NYC, and I was unable to move to NYC. After months of hitting total dead-ends looking for work, my old work called me back in to take down some servers that my former boss had left running. Someone new was taking over his old space, so all of his old stuff needed to be taken care of. I went in and dealt with everything, and started to walk out the door. And then I stopped and went back, because I thought, well I should probably introduce myself to the new person and let her know who I was so she could contact me if she found more stuff or had questions about anything. When I went in and introduced myself, she started asking me about what I had done for Mike, my old boss, and it turned out she was in the same boat Mike was, she had moved across the country and left her programmer behind. Neuroscience research with primates is a small field, so of course, she knew Mike, and so Mike's recommendation was valuable, and skills needed for both jobs greatly overlapped. So, when I finally found employment, it was a combination of serendipity and because of someone I knew well. Yeah, n of 1, I know, but, heh, the internet.

So, when I looked back on my 12 years of being a sysadmin and a software dev, I realized that I could pretty much count on one hand the number of other software developers I knew in the Seattle area. So, if I didn't know very many people who could recommend me, and I suck at interviews, it is no wonder I had such a hard time finding a job in the tech market! Why did I know so few other developers? This was for many reasons, but the main one was that I was in academia in a fairly secure job the whole time, and wasn't really thinking enough about life after my current gig. My ways of improving myself had always been books, mailing lists, and classes, which doesn't get you out in the community much. Plus, I had been working as both a sysadmin and a developer, but quite frankly the sysadmin required much more constant learning and tended to hog my learning time, because it involved so many different technologies and programs. I decided during my period of unemployment I wanted to focus on software development, and clearly, if I was going to continue to be a coder and be able to deal with any future unemployment or career moves, I was going to have to change how and what I was learning drastically. And so, the introvert starts the slow task of reaching out.

I started going to various meetups, and other local meetings of coders and learning more about how to become a better developer, figuring it would be good to kill two birds with one stone. And, while it has never been clear to me why I would want to kill two birds with stones, it was abundantly clear why I needed to both network and become a better developer. Since I was a self-taught developer, who had spent all my time in academia, I knew there were some holes in my skills and knowledge. In the meantime, I was really loving my new work at the University, but I was a little worried, because I was once again working by myself in a science lab with a bunch of primarily science researchers who knew a little about coding. One day at an event I was talking to people about becoming a better programmer. They were emphasizing pair coding, code reviews, working in a team, etc. and so I asked how I could improve when I worked by myself. One of the developers I was talking to told me that he didn't think anyone could become a great programmer working by themselves. Not the answer I was looking for.

Next post: How I responded

So sue me, I've been busy ~ Comments: 0


The Elephant in the Room


MyRamblings, MyLife, Tech, JobHunting

by Maria on 21 Sep 2013 - 06:27  

Elephant, not in a room

For about 13 years, I have worked in a neuroscience lab as a programmer and a sysadmin. The last time I was interviewing for a job, I had to convince my potential employers that I was smart, dedicated, easy to work with, and would get the job done. Fortunately, I'm pretty good at that. I knew very little about how to program, but I convinced my future boss that I had a plan, the dedication, and the smarts, to teach myself how to program, so I was hired as a programmer.

And now I have 13 years experience as a programmer and a system administrator, so theoretically, it should be easier to get a job. Unfortunately, the interview has changed. It is no longer enough to tell the interviewer what I have done, what I can do, and how I work. The amount of information about me that is available to a potential employer is way more than it was 13 years ago. But, this time around, what potential employers really want to know, is can I solve a toy programming problem, while they watch me and evaluate me (or worse, evaluate me over the phone using a shared doc), under more or less a timed condition. These are the worst possible conditions for me to perform under, and this test has nothing to do with how I will perform as an employee. Seriously. It doesn't reflect how well I program or how much I know about programming or what kind of an employee I am.

I understand that everyone gets nervous about interviews. But, clearly, some people do not become as brain dead as I do. I have been working hard to overcome this. I practice as much as I can, but the situation is frustrating. I know that if I continue to practice, I will get better at it, and eventually I will be able to finish an interview without feeling like I can no longer even recite the alphabet. (I find practice interviews help only a little bit, since most of the pressure is missing.) Eventually, with enough time and practice, I'll get lucky, and the interview will consist of questions that don't put me in panic mode. But, why should I have to do this? Why have we settled on this as the process, when it has so little to do with what candidates need to succeed at work? Interestingly, the process favors people who job hop a lot, and therefore get more practice with real interviewing. That, and showoffs.

And why are interviews a particular problem for me?

It isn't the male environment. I've been living in a mostly male world since I started playing tackle football with the neighborhood boys at age 8.

When I was in the Army, I attended a school to learn large diesel truck mechanics. At one point in the school, one of the instructors had me take over the class, because he felt I knew the material better and was a better instructor. As an undergrad, I was fine with teaching the first year physics lab series, even enjoyed it. For years, I taught weight training. Clearly, the problem isn't speaking in front of people.

It is not the white board. As someone who has been in academia for a very long time, I am very intimate with the white board, and even its predecessor, the black board. It is a very useful tool for sharing ideas with others, and I am more than happy to write code on it, if I feel like we are collaborating and/or learning, and not like I am being judged and timed.

It is not the pressure, per se. Pressure I can deal with, and can thrive in. I ran a mail server for years, and there is nothing quite like the pressure of the mail server going down while your colleagues are at an important conference and very dependent on email. This sort of pressure can energize me.

It isn't even that I don't like these sorts of coding problems, I actually do, as long as I'm doing them in my living room in my pajamas for fun.

So, what is it then? I first noticed this problem of my brain shutting down when I was an undergrad in the physics department. I knew the material. I did well on the assignments. And then, I looked at the test and my brain stopped functioning. What was different about physics and auto mechanics? Well, physics is definitely more difficult (unless you're engineering the auto, or troubleshooting the electrical system, and not just replacing the cv boot). But, that certainly can't explain all of it, because I was doing fine on assignments, and even quite a few of the tests, even though I sometimes felt like I was working at half brain capacity. So, it wasn't simply the difficulty of the material.

Instead, I think it was the structure of the testing. In the physics department, the tests were usually designed so that most people wouldn't finish the tests, and so that no one would get a perfect score. They didn't always succeed in their design plan, going in both directions. I remember tests where the mean was 25 out of 100, and I remember tests where a couple of people managed to get perfect scores. But, usually the mean was right around 50%. This was much different from the auto mechanic courses, and quite frankly, different from most of my other college subjects. And it was demoralizing and scary. And it didn't help that instructors often said things like, 'It should be obvious that...', for things I didn't find obvious at all. Fortunately, none of them were patronizing. Oh, wait. And the more I became insecure about my abilities, the more difficult the tests became. I saw this happening, but could not find a way to stop it. I nearly switched majors because of it. I feel the same kind of pressure now in interviews, but there are different factors.

1. As many geeks are, I am an introvert. I have a hard time talking to strangers, especially small talk. I've mostly gotten over this, but when combined with the other three factors, I think it still plays a role.

2. I think before I speak. So, sometimes there are those long pauses that interviewers hate, because they want to know what is going on in your brain. I try to go back, and say this is what I thought of, and why I rejected it, but it is very difficult for me to speak while I am thinking. Weird, I know. Apparently, this is a thing. And, when I realize I have been silent for a while, I get nervous about that, and we start another cycle of brain freeze.

3. I sometimes experience Impostor Syndrome. I am hyper-aware of how much I do not know. I love learning, and am constantly learning and growing, but sometimes the awareness of how much I don't know makes me feel inadequate, hence, an impostor. And admitting that I sometimes experience Impostor Syndrome makes me feel inadequate. Totally kidding, the recursion is not infinite.

4. I do not have formal CS training.

All of this means that I become particularly terrified during interviews, but NONE of these things has ever had any bearing on my actual work. Impostor Syndrome seems to particularly affect women, and so I have to wonder if it is the confluence of some of these factors (introversion, impostor syndrome, and awful tech interviews) that discourages geeky women in particular from staying in the tech industry. Of course, it is more complicated than this, but I do believe the interview structure sure can't be helping the numbers of women. If I, and I am stubborn, and I love making my life as difficult as all hell, sometimes wonder if I should bail because of interviews alone, then it must also be a factor for others.

So, what can be done? What is it that employers really need to know about an applicant before they hire them?

1. Will they get the job done? If they don't know something they need to solve the problem, do they know how to figure it out in a reasonable time? Do they know how to google (seriously, this is an art), and regularly use IRC and/or mailing lists? Are they organized, and do they approach problems reasonably systematically? Do they know how to troubleshoot? Are they tenacious? Do they know when to ask a mentor/colleague for help? Are they willing to try new things?

2. Are they reasonably easy to work with?

3. Do they fit in with company culture and the particular team?

Is there anything about the current popular tech interview format that answers these questions?

My best experience with an interview was one in which the interviewer described what the group was working on, and specifically what I would be working on. We discussed ways to move forward on the current project. We discussed existing problems, my ideas on what to do, and their ideas on what to do. We discussed which technologies I had worked with before, how deep my understanding of them was, which ones were new, and how I would get up to speed on the new ones. At the end, I think we both felt pretty comfortable about what we were getting. I understand that sometimes companies don't want to divulge this much information about what they are working on. But, they can certainly say something like, on day 1 when/if you start here, you will be using technologies A, B, and C. Which of these are you comfortable with, and which do you need to get up to speed on? How would you go about getting up to speed? Which combinations of technologies have you used before, and what were the challenges in how they worked together?

Ideas on other ways to understand applicants:

In that interview, we did not sit down and look at any code, but I could imagine sitting down and looking at a bug in some code and discussing how to solve the bug. Since most of a programmer's time is spent debugging, refactoring, optimizing, and testing, and often you are dealing with an existing code base, it seems that talking about refactoring and troubleshooting are way better ways to learn how a person thinks, in a way that is relevant to how they will perform on the job. And I do mean talking, not testing their coding ability. In all of the interviews I have had so far, absolutely no one has asked about troubleshooting and refactoring code. Try some pair programming. Have a candidate look at some code and describe to you what they think the code is doing. If the company is sensitive about its code base, maybe they can fork some open source code that is close to a realistic problem they might face, and the interviewer and interviewee could discuss the merits of the code, maybe even hack on it a little, on an actual computer. Yes, if you feel a deep need for the candidate to write some code, at least have them work on a computer, preferably their own, without someone looking over their shoulder. How about some pair programming or pair troubleshooting so it feels collaborative, and more like what actually working with this person will feel like? Who says you can only discover how someone thinks by talking to them while they are thinking? Why not wait until they are done, and then you can talk about why they made the choices they did, and what other things they thought of and rejected? I have had companies ask me to write sample code, and then bring me in for an interview, and never bring up the sample code at all. That makes no sense to me. Ask them why they made the choices they made, if they have thought of any ways to improve it, etc. Try to help the candidate feel more comfortable, because that is more realistic for how they will work.
For coding, stick with computers. Whiteboards are really awesome for discussing concepts, illustrating (literally) what code is doing, and designing, and can be used for these things during interviews, but despite what I said above, they kind of suck for actually writing code.

Recently I was talking to a recruiter at a large company, and he mentioned how this company was going to start offering tech interview classes. Really?!?!?! This is ridiculous. If your interview process is not screening for what you need it to screen for, and if you know there are qualified people out there, that you want to hire, and you find yourself starting to offer them training on how to get through your interview process, then it seems that it is the process that is screwed up, and not all of the qualified applicants who can't seem to jump through your hoops. Think out of the box, employers!

The bottom line is this, if I am treated as if I am an expert, and you are inviting me in to see if I can help you to solve a problem, you are going to get a much better idea of how I work and how I am to work with, than if you give me a random problem to solve and ask me to solve it on the whiteboard while you watch me and your watch.


Other people complaining about tech interviews in interesting ways:

The Elephant in the Room ~ Comments: 2


More Tries in Python


Code, Python, Tech

by Maria on 13 Sep 2013 - 06:40  

In the previous blog post we tried to figure out what the python code I found on Wikipedia that was supposedly returning an iterator over the items of the Trie was actually doing, and then we sort of ran with that to gain some understanding of Tries, generators, and recursion in Python. And that was great, but in the end we didn't really have anything useful. Our method returned all of the letters in our Trie, one at a time. It seems it would be more useful if it returned all of the words in our tree, so let's try doing that. Since we have a goal, let's create a test so we can tell if we have reached our goal. Here is our test code:

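import unittest
from trie import Trie

class TestWords(unittest.TestCase):
    """Tests for function words"""

    def setUp(self):
        # Build the Trie under test. The value stored with each word (its
        # length) is an assumption; any truthy value marks a word's end.
        self.mytrie = Trie()
        for word in ['ant','ante','antic','antsy','antse','ban','banana']:
            self.mytrie.add(word, len(word))

    def test_default_case(self):
        """Test words retrieves all words properly from Trie."""
        expected = ['ant','ante','antic','antsy','antse','ban','banana']
        actual = []
        for words in self.mytrie.words():
            actual.append(words)
        print 'actual', actual
        print 'expected', expected
        # Traversal order is not guaranteed, so compare sorted lists.
        self.assertTrue(sorted(actual) == sorted(expected))

if __name__ == '__main__':
    unittest.main()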

I've made a fairly complicated, but relatively small, Trie, to really give our code a run for its money. And yes, I even made up a word. And, of course, our method fails miserably. So, let's see about getting some words instead of letters.

We know that many of our words have prefixes in common, that is the point of creating a tree like this. So, the general idea is going to be to collect letters and re-use them. To begin, let's just start collecting letters, and spitting out what we have when we reach the end of a word. We can tell if we are at the end of a word by whether there is a node.value.

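    def words(self, prefix = []):
        """Return an iterator over the items (words) of the 'Trie'."""
        # Note: the default list is created once and shared by every
        # recursive call, which is why letters keep accumulating.
        word = False
        for char, node in self.root.iteritems():
            prefix.append(char)  # collect each letter as we descend
            if node.value:
                yield ''.join(prefix)
            for i in node.words():
                yield i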

And, our test fails!

actual ['ant', 'antic', 'anticsy', 'anticsye', 'anticsyee', 'anticsyeeban', 'anticsyeebanana']
expected ['ante', 'antic', 'ant', 'antsy', 'antse', 'banana', 'ban']
FAIL: test_default_case (__main__.TestWords)
Test words retrieves all words properly from Trie.
Traceback (most recent call last):
  File "", line 27, in test_default_case
AssertionError: False is not true

Ran 1 test in 0.001s

FAILED (failures=1)

I have printed out the expected and actual list of words, and we can see immediately what the problem is. We need to figure out how to delete the letters we no longer need. How do we decide when we should delete a letter? Let's take a look at our Trie:
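The Trie can also be pictured in text. What follows is a minimal sketch and not code from this post: it assumes Python 3 (so `items()` instead of `iteritems()`), a stripped-down `Trie` along the lines of the one from the previous post, and a hypothetical helper named `outline`. The value stored with each word (its length) is also just an assumption.

```python
from collections import defaultdict


class Trie:
    """Stripped-down stand-in for the Trie from the previous post."""

    def __init__(self):
        self.root = defaultdict(Trie)
        self.value = None

    def add(self, s, value):
        head, tail = s[0], s[1:]
        node = self.root[head]
        if not tail:
            node.value = value  # a value marks the end of a word
            return
        node.add(tail, value)


def outline(trie, depth=0):
    """Hypothetical helper: one line per node, indented by depth.

    A trailing '*' marks a node where a word ends.
    """
    lines = []
    for char, node in trie.root.items():
        marker = '*' if node.value else ''
        lines.append('  ' * depth + char + marker)
        lines.extend(outline(node, depth + 1))
    return lines


trie = Trie()
for word in ['ant', 'ante', 'antic', 'antsy', 'antse', 'ban', 'banana']:
    trie.add(word, len(word))  # the stored value is arbitrary here

print('\n'.join(outline(trie)))
```

Each level of indentation is one step down the tree, so the shared prefix a-n-t shows up once, with the e, i-c, and s branches hanging off the t. On Python 3.7 and later, dictionaries keep insertion order, so siblings print in the order the words were added; under the Python 2 used in the rest of this post, sibling order may differ.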

It looks like we will need to delete letters when we come to the end of a word, but only if that letter is at the end of a branch. How will we know? And how many letters will we delete? When we get to the end of the word antsy, we will need to delete just the y, but when we get to the end of the word antse, we will need to delete both the e and the s. (We actually have no way of knowing which branch will be traversed first; since the tree is based on a dictionary, the order in which it visits nodes is arbitrary.)

What would be really useful at this point is if we knew at each branch how many words share that prefix. We could certainly figure this out by traversing the tree once, and then using the information when we traverse the tree again to get the words. But, it seems like this might actually be useful information, in and of itself. So, maybe this is something that really should already be available. This is the advantage with playing around with your proposed data structure before production use. We can still go back and add functionality. So, let's put some new code in our add function to keep track of the number of words using a given prefix. The first two methods of our class now look like this:

from collections import defaultdict

class Trie:
    def __init__(self):
        self.root = defaultdict(Trie)
        self.value = None
        self.count = 0  # number of words that pass through this node

    def add(self, s, value):
        """Add the string `s` to the
        `Trie` and map it to the given value. Additionally, keep track of
        how many words each node is shared with.
        """
        head, tail = s[0], s[1:]
        cur_node = self.root[head]
        cur_node.count += 1
        if not tail:
            cur_node.value = value
            return  # No further recursion
        self.root[head].add(tail, value)

Very simple addition, which is going to make our life much easier. We may also want to create a method that returns the number of words in our tree that share a particular prefix we may be interested in, but for now, let's finish our iterator.
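To see what the new counts look like, and what a prefix-counting method might look like, here is a minimal sketch. It is written for Python 3, repeats the class so it runs on its own, and `count_with_prefix` is a hypothetical name rather than anything defined in this post. It also assumes each word is added only once, since adding a word twice would count it twice.

```python
from collections import defaultdict


class Trie:
    def __init__(self):
        self.root = defaultdict(Trie)
        self.value = None
        self.count = 0  # number of words that pass through this node

    def add(self, s, value):
        head, tail = s[0], s[1:]
        cur_node = self.root[head]
        cur_node.count += 1
        if not tail:
            cur_node.value = value
            return
        cur_node.add(tail, value)

    def count_with_prefix(self, prefix):
        """Hypothetical helper: how many stored words start with `prefix`."""
        node = self
        for char in prefix:
            # Test membership first so that looking up a missing letter
            # does not make the defaultdict create a new, empty node.
            if char not in node.root:
                return 0
            node = node.root[char]
        return node.count


trie = Trie()
for word in ['ant', 'ante', 'antic', 'antsy', 'antse', 'ban', 'banana']:
    trie.add(word, len(word))  # the stored value is arbitrary here

print(trie.count_with_prefix('ant'))   # 5
print(trie.count_with_prefix('ants'))  # 2
print(trie.count_with_prefix('ban'))   # 2
print(trie.count_with_prefix('cat'))   # 0
```

One wrinkle in this sketch: the top-level Trie object never has its own count incremented, so an empty prefix returns 0 rather than the total number of words.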

For every letter that we find, we now know how many words use that letter, so if we keep track of how many words we find that are using each letter, we can compare the two. When we reach the end of a word that is also the end of a limb, we need to delete at least to the last branching point, but possibly farther. We traverse A-N-T-E and we have hit an endpoint. The E only has one word associated with it, so we can delete it, but the T has 5 words associated with it, so we need to check if we have finished 5 words yet. We can't just keep one running tally of words finished, because sometimes we will be keeping track of multiple branching points. For example, we need 5 words for the T, but only 2 for the S, but can be working on completing both of those branches at the same time. Clearly, we need to keep a list of words finished for each letter. We start at the A, we have finished no words, [0], continue to the N, [0, 0], and now to the T, [0, 0, 0], but now we have finished a word, so we go back and add 1 to everything [1, 1, 1]. Let's go to the E, we have [1, 1, 1, 0], and once again we have finished a word, so we now have [2, 2, 2, 1]. We are now at the end of a limb, so we delete letters.

But, wait, how did we know we are at the end of a limb? Well, we know we are at the end of a word, since we can check to see if there is a value associated with it. And if we check the node.count at this point, we can see that this letter is associated with just one word, so we must be at the end of a limb. Great, we can delete letters. How many? We have to check. We delete the E, then check the T. The T has 5 words associated with it, but we have only 2 recorded for the T, so we can't delete it yet. We are done deleting, as no earlier letters can possibly be ready for deleting yet. Now we go to the I, but wait, our letters-visited-list is [2, 2, 2, 1]. Hmm, looks like we need to remember to delete the 1 at the same time that we delete the letter. Makes sense, as this list should correspond to our prefix list. We are at [2, 2, 2, 0] now. We go to the C, and we are at the end of another word, so the array becomes [3, 3, 3, 1, 1]. We check our node.count, and it is 1, so we are at the end of a limb. Delete letters at the end of our array until we get back to the T, which is still a 5, and we are only at 3. Great, we seem to have the hang of this. But, we have now checked the T twice, and you may remember that we are not actually going to have access to that node when we are backing out. You can tell this is true, because when we were getting the letters, and not deleting any of them, we did not see multiple T's. The final word was 'anticsyeebanana', so the algorithm was just tacking on more endings, and not re-visiting previous letters.

What would make more sense anyway is to keep an array of the nodes we have visited, and what their corresponding word counts should be. So, that list would now look like [5, 5, 5, 1, 1]. Now we just compare the arrays and when numbers are equal, get rid of those letters. Well, we still must only check when we are at the end of a limb. But, this seems a little over-complicated too. What if we just collected the word count array from the nodes, and subtracted 1 from the whole array whenever we hit the end of a word? Any letters currently in the array would be ones that this word would be a part of, since it is just a representation of the letters in the prefix array. Any zeros would represent the letters (and positions in the word count array) we need to delete. Let's try it.

    def words(self, path=[], prefix=[]):
        """Return an iterator over the words of the `Trie`."""
        # path and prefix are deliberately shared between the recursive
        # calls (mutable defaults), so every level sees the same two lists.
        for char, node in self.root.iteritems():
            # record the letter we are on and how many words use it
            prefix.append(char)
            path.append(node.count)

            # if there is a node.value, then we are at the end of a word and
            # we should yield the word and subtract 1 (word!) from the
            # entire node path
            if node.value:
                yield ''.join(prefix)
                for x, y in enumerate(path):
                    path[x] = y - 1

            # if we are at the end of a word and the count is 1,
            # we are at the end of a branch, and it is time to
            # delete letters back to the last branch we haven't
            # gone down yet.
            if node.value and node.count == 1:
                for j in range(path.count(0)):
                    del path[-1]
                    del prefix[-1]

            for i in node.words():
                yield i

And, let's run our test.

actual ['ant', 'antic', 'antsy', 'antse', 'ante', 'ban', 'banana']
expected ['ante', 'antic', 'ant', 'antsy', 'antse', 'banana', 'ban']
Ran 1 test in 0.000s


I left the print statements in so you can see the words. So, at this point, we should create a bunch more tests to check more edge cases.

Do you suppose this is what the author was trying to do with the code in the Wikipedia article? Is there any way to do this without including the node.count or checking the tree twice?
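One possible answer, as a rough sketch rather than a tested replacement: pass the prefix down as an argument, so each recursive call builds its own string and nothing ever has to be deleted. The code below is a self-contained Python 3 version (hence yield from and items() rather than iteritems()); the plain-dict root and the setdefault call are simplifications assumed for this sketch, not the Wikipedia implementation.

```python
class Trie:
    """Minimal trie: no node.count, and the tree is only walked once."""

    def __init__(self):
        self.root = {}     # maps a letter to a child Trie (assumed layout)
        self.value = None  # set only on the node that ends a word

    def add(self, s, value):
        head, tail = s[0], s[1:]
        node = self.root.setdefault(head, Trie())
        if not tail:
            node.value = value
            return
        node.add(tail, value)

    def words(self, prefix=''):
        """Yield every word stored below this node."""
        for char, node in self.root.items():
            word = prefix + char  # a new string, so siblings are unaffected
            if node.value is not None:
                yield word
            # keep going: longer words may hang off this node
            yield from node.words(word)


trie = Trie()
for number, word in enumerate(
        ['ant', 'antic', 'antsy', 'antse', 'ante', 'ban', 'banana'], 1):
    trie.add(word, number)

print(list(trie.words()))
```

The trade-off is that every level of recursion builds a new prefix string instead of sharing one list, which is simpler to reason about but does a little more copying.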

You can find my code on GitHub.

Finally, here is a nice post about how Tries are useful for word storage for computer versions of Boggle, without ever mentioning the word Boggle.


Python: Recursion, Generators and Tries


Code, Python, Tech

by Maria on 05 Sep 2013 - 04:25  

I was learning about Tries, and writing some code to do fun stuff with them, and along the way I ran into some code on Wikipedia that seems faulty, and also came up with a silly way to explain recursion. So, I thought I'd share.

First the silly explanation of recursion:

Let's say you have this wooden sign, but somehow the letters were put on the sign backwards, so your 'welcome' sign says 'emoclew'. So, you put it in your magic box. The box has boxes inside boxes, each with one door leading to the next inner box, kind of like a Russian doll with doors. The magic part is that the inner boxes and doors only appear when you feed in a piece of wood with at least one letter on it. The sign goes through the first door, and the first letter (e) is cut off and set aside, and the rest of the sign continues to go through doors. Each time, a letter is cut off and set by the door, and another door magically appears, until there is just one letter on your piece of wood. At this point no more doors appear, and the wood is sent back through the door(s) it came in. Each time it goes out of a door, the letter from that box is put back on the piece of wood, but now on the other side of the piece of wood. So as it goes back through the doors it looks like this: 'w' -> 'we' -> 'wel' -> 'welc', etc., until it pops out of the last door, and you have your sign the way you want it. Or, you could just repaint the sign. ;-)

This is pretty much how recursion works. The important bit is that the data is processed both on the way in and on the way out, and each step does exactly the same thing. This is how that code looks:

def reverse(s):
    if s == "":
        return s
    return reverse(s[1:]) + s[0]


The first time it goes in, there is a string there, so it skips past the if and runs reverse('moclew') + 'e'. The 'e' just hangs around by the door waiting for the return to be completed, while the 'moclew' goes through the next door. Same thing with the next run, the 'm' hangs out by the door, while the 'oclew' goes through the next door. Once we get to the last letter, no new door appears, instead our 'w' gets kicked out. This is the 'return s' line. (Strictly speaking, the code as written opens one more door for the empty string that is left after the 'w', and that empty string is what 'return s' hands back, but the effect is the same.) But, we still have to go back out the doors we went through, so at the next door, we hit our return reverse(s[1:]) + s[0]. At this point we have 'w' (the reverse(s[1:]) from the last door) plus the 'e' (s[0]) we left hanging out, so we add them together. Going back through the next door, the reverse(s[1:]) is now 'we' and we add the 'l', continuing until we once more have our welcome sign, letters correctly written.
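To watch the doors open and close, here is a small sketch of the same function with a depth counter and a log added. The name reverse_traced and the depth and log parameters are additions made up for illustration (and it is written for Python 3); the logic is otherwise the reverse shown above.

```python
def reverse_traced(s, depth=0, log=None):
    """Reverse s recursively, recording each trip through a door."""
    if log is None:
        log = []
    indent = "  " * depth
    log.append(f"{indent}in:  {s!r}")
    if s == "":
        result = s  # base case: nothing left, so no new door appears
    else:
        # s[0] waits by the door while the rest goes one box deeper
        result = reverse_traced(s[1:], depth + 1, log) + s[0]
    log.append(f"{indent}out: {result!r}")
    return result


trace = []
print(reverse_traced("emoclew", log=trace))
print("\n".join(trace))
```

Each 'in' line is a door being entered, and each 'out' line is the piece of wood coming back with one more letter attached.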

Now things get weird (and very useful) when we get multiple inner magic boxes. Let's say instead of our single word, we have a word tree. This time we aren't reversing the letters, just collecting them to find out what all words we have in our tree.

  C R
 K   E

Wouldn't it be great if we could find all the words at once? This is basically what recursion can do. We send our tree in to our magic box. It collects the first letter, sends the tree through the next door, gets the second letter, sends the rest of the tree through the next door, but now finds two letters where it normally finds one. So, open two doors, and send each remaining tree through its own door, simultaneously! (Conceptually, at least. As we will see further down, plain Python recursion still walks the branches one at a time; the win over a hand-written for loop is that the code does not need to know in advance how deep or how bushy the tree is.) Recursion is most useful when you have data in tree format, such as directory structures, classification hierarchies, organization charts, etc. Or, if you are trying to do a flood fill. For an awesome explanation of the flood fill algorithm (with cats and zombies!) see the Invent with Python blog. There are, of course, times when recursion is not the best solution, such as calculating the nth Fibonacci number.

Great, now we are ready to look at Tries and the actual code to do this, and along the way introduce generators.

I was trying to understand Tries (aka Prefix Trees), and started playing around with the Python code on the Wikipedia entry for Tries. (You should probably go ahead and look at the whole article if you aren't sure what Tries are.) One of the last methods of the Trie class in that code is items, which claims it returns an iterator.

def items(self):
    """Return an iterator over the items of the `Trie`."""
    for char, node in self.root.iteritems():
        if node.value is None:
            yield node.items
            yield node

So first let's see if we can figure out what this code is doing. Notice the yield statements. The yield statements indicate this is a generator. A yield statement is similar to a return statement, in that it sends the contents of whatever it is yielding to the calling function, but there is an important difference. As the name implies, it is only temporarily handing control over to the calling function. The generator is an object capable of creating and returning its members one at a time, because it is able to remember where it is when control is returned to it. Its purpose in life is iteration. If you have no idea what an iterable is, there is a great post on iterables here. This all may seem a bit fuzzy, but I hope by diving into this code some more, it will become clear.

So, this code seems to be creating a generator that spits out the node items of our tree. That sounds right. We know from looking at the other methods in our class Trie, that node.value is None until we come to an end of a word. So, maybe it spits out letters until it gets to the end of the word, and then moves to the next node? How would that happen? Hmm, this isn't sounding quite right...

And, when I started playing around with this method, I found it did not return what I expected or wanted. I wrote a little code snippet to test the items method. To start with I just added 2 entirely different words to my Trie. Here is a graphical representation of my tree:

And here is my little script:

from trie import Trie
mytrie = Trie()
mytrie.add('terror', 1)
mytrie.add('plant', 2)
for letters in mytrie.items():
    print 'letters', letters

I thought this would give me back all of the letters in my tree, but instead I got the following:

letters <bound method Trie.items of <wiki_trie.Trie instance at 0x1101921b8>>
letters <bound method Trie.items of <wiki_trie.Trie instance at 0x11018eef0>>

Certainly not the letters I expected. So, what is going on here? Well, the first problem is that we seem to be yielding the wrong thing, if we want letters. It seems to be yielding the method itself. Well, yes, that is exactly what it is doing. What do you even do with that? Some sort of twisted recursion? Hmm, well no surprise that the char in the iterator is the letter, so let's try yielding that instead. But it turns out, that if we yield char, we only get a letter 't' and letter 'p'. Why aren't we getting all of our tree?
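To see the difference between yielding a method and calling it, here is a tiny stand-alone sketch in Python 3. The Node class and its two hard-coded letters are made up for the demonstration and have nothing to do with the Wikipedia class.

```python
import inspect


class Node:
    def items(self):
        yield "a"
        yield "b"


node = Node()
not_called = node.items  # no parentheses: just a reference to the method
called = node.items()    # parentheses: a generator we can iterate over

print(not_called)                   # a bound-method repr, like the output above
print(inspect.isgenerator(called))  # True
print(list(called))                 # ['a', 'b']
```

The first print is the same kind of thing my test loop was handed: an object you could call, but not the letters themselves.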

So, our tree is made of up to 26 nodes at each level (assuming we are just allowing the lower case alphabet). In our example, we are using 2 nodes at the top level, 't' for 'terror' and 'p' for 'plant'. So, it looks like our code is iterating through those two nodes and stopping, and now we have to figure out how to get it to iterate through each of the child nodes as well. So, how do we do that? Recursion! Remember how I mentioned that recursion is useful for tree traversal? So, what happens if we iterate over node.items()*? Let's change our method, and for now, let's just print what we get:

* Note the parentheses that were missing in the original code on Wikipedia - we are now calling the function, not yielding the name. And no, yielding the call would not have been any more useful than yielding the name, or at least I could not find a use for it...

 1  def items(self):
 2      """Return an iterator over the items of the `Trie`."""
 3      for char, node in self.root.iteritems():
 4          # node.value is None when not at the end of a word
 5          if node.value is None:
 6              yield char
 7              for i in node.items():
 8                  print 'i', i
 9          else:
10              yield char

And, rerunning my little test script, I get:

$ python
letters p
i l
i a
i n
i t
letters t
i e
i r
i r
i o
i r

Aha! So, now we yield this, and we are in business. A quick note about yield vs. return. With yield, you are not waiting until the function exits to return your variable. Using the box analogy, you are not waiting for the block of wood to go through all its boxes and come back again. You are yielding your variable as soon as you are done with initial computations and hit yield, but remembering where you are, because you do expect to see the block of wood again. This means that, just like above where the i prints right away, you will get your letters one at a time as they are found.

We change our code again to this:

 1  def items(self):
 2      """Return an iterator over the items of the `Trie`."""
 3      for char, node in self.root.iteritems():
 4          # node.value is None when not at the end of a word
 5          if node.value is None:
 6              yield char
 7              for i in node.items():
 8                  yield i
 9          else:
10              yield char

And we get our letters!

letters p
letters l
letters a
letters n
letters t
letters t
letters e
letters r
letters r
letters o
letters r

Note that Python 3.3 added the syntax yield from X, as proposed in PEP 380, which simplifies our inner loop. With it you can do this instead:

yield from node.items()
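Here is a small stand-alone sketch of that syntax in Python 3. To keep it short it walks nested dicts rather than the Trie class; the letters and make_branch functions and the hand-built tree are stand-ins invented for the demonstration.

```python
def letters(tree):
    """Yield every letter in a nested-dict tree, one branch at a time."""
    for char, subtree in tree.items():
        yield char
        # hand the iteration over to the recursive generator
        yield from letters(subtree)


def make_branch(word):
    """Build the nested dicts for a single word, e.g. 'ab' -> {'a': {'b': {}}}."""
    branch = {}
    for char in reversed(word):
        branch = {char: branch}
    return branch


tree = {}
tree.update(make_branch("plant"))
tree.update(make_branch("terror"))

print(list(letters(tree)))
```

The yield from line does the same job as the two-line inner for loop: everything the inner generator produces is passed straight up to whoever is iterating.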

But, what is going on in this code? It was straightforward when we had the print statement, but now we have recursion in a generator. We had to make it a yield statement, because a generator in Python 2 can't hand back a value with a return statement, but that does make things weird. So, now what is actually happening here?

Let's step through our loop. In our test code, we enter our loop and call our method to get the first letter. node.value is None, so it yields char, and yields control back to our test loop. Our test loop prints 'letters' and the letter yielded. Now we go to the next 'letters in mytrie.items()' in the test loop again. This is the point where if we hadn't added that inner loop, we would just continue to loop through that top h-node. Instead, something happens that may be a little unexpected. When we return to our method, we return to the same place we left; we do not go back to the beginning again, and this time this has consequences. This is something that is often glossed over when generators and yield statements are introduced. It is not just the variables that are preserved, but the position in the function as well. In the box analogy, perhaps when you return to the previous room (function call), instead of magically appearing at the exit door, you have to continue down a hallway to get to the exit. And maybe you have to pull a lever or something else before you leave.
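A minimal sketch of that "same place we left" behaviour, in Python 3. The two_steps generator and its log list are invented purely to make the resume points visible.

```python
def two_steps(log):
    log.append("started")
    yield 1
    # execution resumes here, not at the top, on the second request
    log.append("resumed after first yield")
    yield 2
    log.append("resumed after second yield")


log = []
gen = two_steps(log)
first = next(gen)   # runs up to the first yield
second = next(gen)  # picks up right after it
rest = list(gen)    # runs the remaining code; nothing more is yielded
print(first, second, rest)
print(log)
```

Nothing in the function body runs until the first value is requested, and each later request carries on from the line after the previous yield.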

When we left our items method, we had just yielded char on line 6, so now we pick up on line 7 (the check on node.value was already done before we left), and we enter our 'for i in node.items()' loop (I'll just call it the i-node loop). But, wait, node.items() is calling our method again (we hit recursion!), so now we do go back to the beginning of our method, and we continue until we hit our next yield statement, yield char again. Now yield char is an 'l', since we have iterated forward one, while still in our method. But, this yield is to the recursive call, so now control is returned to our i-node loop. Our i-node loop just has another yield, so now we yield again to the test loop, and print our 'l'. Okay, let's do another loop. We make our call from the test loop, and go back to our function, but where? The last yield statement was in the i-node loop, so that is where we return, but that loop is now completed, so we go back to the next to last place we yielded, right after the yield char. That is followed by the i-node loop, where we finally get to create a new generator, and advance to the next letter. Every time through, we will create a new generator, and have to back out of that many more yields. While this is happening, Python is returning to the stacks it created in previous calls to create the current stack. This may sound inefficient, but that is because we are only going down one branch at the moment. As soon as we have to back partway out, and go down a new branch, it becomes more obvious why it is good to keep the old stacks around, backing out of yield calls to figure out which stack is relevant.

It is worth noting, that if this were ordinary recursion, it would not have worked. The first loop in our function is yielding char. If it were returning char, the inner loop could not advance, because it needs to know both what the last i was, and what node.items is in order to advance. In a normal recursive call, it would not have access to this information, since these variables are not being returned.

In our example of recursion with the magic boxes, we left a variable by the door, and picked it up on our way back out. The only reason we were able to do this is because that variable was in the return statement. In a normal function, once you leave the function (even to go into the next call of the same function), you lose any other information from that function, except what is in the return statement. But, in a yield statement, you return to the same place in the function, with all of the information you had when you were in the function before. And, if there is code after the yield statement, it will run that code, as if it had never left the function.

To see that the code continues after a yield, you could put the if statement:

if node.value:
    print "end of word", node.value

after the i-node loop. It will print the end of word, as if it is finding the end of the word as it exits that node on its way back up the tree, as shown here:

maria@lamia:~/python/test$ python
('letters', 'a')
('letters', 'n')
('letters', 't')
('letters', 'i')
('letters', 'c')
end of word 2
end of word 1

for words ant (1) and antic (2)

Switch the order, and now suddenly the world makes sense...

$ python
('letters', 'a')
('letters', 'n')
('letters', 't')
end of word 1
('letters', 'i')
('letters', 'c')
end of word 2

Let's think about our data a minute. I think the graphic on the Wikipedia site is a bit misleading. I think there should be lines drawn horizontally between nodes in the same branch, because it is possible, and necessary, to travel between nodes at the same level in the same branch. Like so:

When we ask for the "next item" using the items method in the case of no recursion, it gives us the next node at that level. But if we are inside our method, and ask for the next node (recursion), we are essentially in a node and asking for the next node of that particular branch. So, in the drawing above, if you are in the "A" node, and ask for the next node, then you get "N" as that is the next node under "A". But if you are not in the "A" node, then the next node is "B". When you get down to the "T" node, you will get an "E", "I" or "S" when you use recursion. But, if you are yielded the "I" and ask for the next letter without recursion, you will get the "S". This will happen when you are moving from one branch to the next, because the method will back out of all of the recursive calls, and then go to the next item, which will be the head of the next branch. There it will hit recursion again, and descend down the next branch. Note that our code is not going down branches simultaneously at this point. This is because we do not have a call to split the recursion to go down the separate branches at once. When it hits a branch, it just goes down the first one it finds. We can consider this in a future post.

Okay, that's quite a bit for one post. Next post, let's try to expand our items function to be more useful.

Many thanks to the Seattle Python Interest Group for helping me sort this all out!


Python and Mac OS


Tech, Code, Python, Mac

by Maria on 17 Jun 2013 - 23:51  

Mac comes with Python, but it is slightly out of date, and Apple has made changes that may become problematic. We could just install Python directly using the Mac installer, but installing additional packages becomes awkward, because this is generally done on the command line, and the Python available from the command line is the system Python, not the one you installed in the Applications folder. Additionally, when you are creating a new project, it is helpful to know what additional packages are necessary for that particular project. Homebrew and virtual environments give you a nice solution to these problems.

We will use Homebrew to install Python and some other utilities. You can even install both Python 2 and Python 3: python will refer to Python 2, and python3 will start Python 3. Homebrew is a package manager, and will not only install stuff, but keep track of what is installed, and allow us to upgrade and uninstall easily.

Then we will use virtualenvwrapper to set up our project. Virtualenvwrapper sets up a virtual environment that allows you to decide what additional modules you want to use for your project. For stuff already installed, it actually creates a link, rather than re-installing the same modules over and over. So, it isn't going to fill your hard drive, but lets you isolate what you are using for a particular project.

To get started, open a terminal and run this to install Homebrew:

$ruby -e "$(curl -fsSL"

If you don't have your path set yet for /usr/local/bin (probably you do), do that now by adding the following to your .bashrc and restarting your shell.

export PATH=/usr/local/bin:$PATH

Python actually comes with its own package manager, pip, which allows you to install additional Python packages. You will still use Homebrew for other packages that you may want. Don't worry too much about using the two package managers. Homebrew knows about pip and its installations and they play nicely together. If a package doesn't install with pip, try brew; maybe it wasn't really a Python package. For more information about Homebrew and Python, see here.

Now we want to install Python and some other packages. Here is a list of packages that I generally find useful. It is a good idea to always update brew before installing or updating other packages:

$brew update
$brew install python
$brew install nose # necessary for testing
$brew install gfortran # necessary for numpy/scipy
$brew install numpy
$brew install scipy
$brew install pillow # this will install the Python Imaging Library (PIL)
$brew install wxmac # very nice for making Python GUI applications that port to Windows, Mac and Linux
$pip install ipython # interactive python - bash in python! As well as other good stuff...
$pip install matplotlib # make graphs! 

Virtualenvwrapper is just virtualenv with a wrapper around it. The wrapper provides some nice utilities to use the virtualenv. Use pip to install both:

$pip install virtualenv
$pip install virtualenvwrapper

After installing, add to .bashrc:

export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/

You will need to reload bash; the easiest way is to exit your terminal and start a new one. Useful virtual environment commands:

$mkvirtualenv test   # Creates a virtual environment for project test
$lsvirtualenv        # See what virtual environments you have
$workon test         # Enter a previously created virtual environment
$deactivate          # Exit virtual environment (use system python again)

More at the virtualenvwrapper website.

Now when you want to work on a project you have created a virtual environment for, you just type workon and you will see a list of possible projects (or just workon test to work on project test).
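If you ever lose track of which Python you are actually running, a few lines of Python will tell you. This is a rough sketch (written for Python 3, with describe_interpreter being a name invented here): it relies on sys.real_prefix, which older virtualenv releases set, and on sys.base_prefix differing from sys.prefix, which is how the built-in venv module and newer virtualenv releases mark an environment.

```python
import sys


def describe_interpreter():
    """Report which Python is running and whether it lives in a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; the built-in venv module
    # and newer virtualenv releases make sys.base_prefix differ from sys.prefix.
    in_virtualenv = (
        hasattr(sys, "real_prefix")
        or getattr(sys, "base_prefix", sys.prefix) != sys.prefix
    )
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "in_virtualenv": in_virtualenv,
    }


for key, value in describe_interpreter().items():
    print(key, "=", value)
```

Run it once before workon and once after, and the executable path should change from the system or Homebrew Python to the one inside your virtual environment.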

So, now what? Well let's say we want to work on a Django project, we could do this:

$mkdir testproject
$cd testproject
$mkvirtualenv testproject
New python executable in testproject/bin/python2.7
Also creating executable in testproject/bin/python
Installing setuptools............done.
Installing pip...............done. 

We see that some things were automatically installed in our virtual environment, basically enough to install the stuff we need. Now while still in our virtual environment, let's install Django, and start our project.

$pip install django
$ startproject testproject
$git init
$git add .
$git commit -a -m 'Initial commit of testproject' 

What's all that git stuff? Git is version control software. Contrary to popular belief, you do not need a Github account to use Git. More information about Git and Version Control can be found here.

Okay, now we have the very beginnings of our new Django project. Next time we want to work on it, we just do:

$workon testproject

and we are on our way! If we want to see what packages we are using with our project, just use lssitepackages, while in our virtual environment. You can use this to create your requirements file so that your project will be easy to use or develop on another machine or platform.


Mail Server


SysAdmin, Linux, Tech

by on 05 May 2013 - 00:44  

Notes from when I set up a mail server, cause you never know when notes like this will come in handy...

First a lesson about mail servers. A mail server is a combination of a bunch of software working together. The two main parts are the email server itself, for which I use postfix, and the imap server (or pop server), for which I use cyrus. Postfix sends and receives mail, and cyrus sorts and stores the mail. And then there are other pieces, depending on what you want to do. I use a combination of Amavisd-new, SpamAssassin, Razor, DCC, Pyzor and ClamAV for spam/virus filtering. I use Squirrelmail (I have actually since started using RoundCube, which I think has a nicer interface, and some users have found is less buggy) as a web interface to the mail server, doing basically the same thing that Outlook, Mozilla Mail, or Thunderbird do on your local machine, but it does it on the server through a web interface. Obviously, you also need a web server for that, which I am already running for our web pages. I use Apache. I have also set up Mailman for our mailing lists. For security, I use Openldap as the authentication system, and certificates, which is a whole 'nother topic.

Notes from setting up new mail server.

I made a special mail account named test. All spam that gets a score greater than 10, plus mail with banned contents or viruses, is sent to this account. This account has weekly rotation set up, and mail in it is deleted after 4 weeks. I made a Python script to rotate the spam and delete it.

Tested getting rid of mail older than 20 days in user test using ipurge. You have to use the -f option, but this also means it checks all folders under the level requested, and if you are cleaning the inbox, this is all folders, so be careful! You must do this as user cyrus if doing it from the command line:

/usr/sbin/ipurge -d 20 -f user.test

That worked fine, so I added this to cyrus.conf in the Events section:

purgetrash cmd="/usr/sbin/ipurge -f -d 14 *.Trash" at=0301

Which purges all messages older than 14 days, in all users' Trash folders and runs every morning at 3:01am. See the man pages for ipurge and cyrus.conf for more details.

mail set up based on:
+ cyrus instead of outside mail delivery
software needed to be configured for mail/web server:

  • backup - bacula
  • web:
    • apache
    • munin
      • /var/lib/munin
      • /var/log/munin
      • /var/run/munin
    • webalizer
    • squirrelmail
    • pmwiki
  • mail:
    • postfix
    • cyrus
      • to change logging for cyrus: /etc/default/cyrus2.2/
    • spam stuff:
      • amavisd-new
      • pyzor
      • razor
      • spamassassin
      • DCC
      • clamav
    • mailman - mailman is a pain moving from one machine to another, be careful of these directories:
      • /var/lib/mailman/archives /var/lib/mailman/data /var/lib/mailman/logs /var/lib/mailman/lists /var/lib/mailman/archives
  • security
    • denyhosts
  • dns server
    • bind
  • ldap
    • configure to use ldap server

good to know:

  • dealing with aliases: If you change the alias database, run newaliases
  • to deal with modules in apache use a2enmod module and a2dismod module

tweaking settings: If you want to configure your system to use more instances of amavisd-new, allocate at least 60MB for each additional instance. If you wanted to double the number of child processes from 2 to 4, you would edit amavisd.conf and change: $max_servers = 2; to $max_servers = 4; Then edit and change: smtp-amavis unix - - - - 2 smtp to smtp-amavis unix - - - - 4 smtp

Amavisd-new (SpamAssassin actually) will be the biggest bottleneck in the system. On a busy server you will probably want 2GB RAM so you can accommodate somewhere around 12 $max_servers.

If you run sa-learn --force-expire or spamassassin --lint -D or other spamassassin commands from the root account, SpamAssassin may change the owner of the Bayes files to 'root'. If it does, amavis will no longer be able to read those files. You would need to run chown -R amavis:amavis /var/lib/amavis to regain ownership. In general, if you do any spamassassin maintenance from the command prompt as root, the best thing to do is run chown -R amavis:amavis /var/lib/amavis afterwards; just to make sure. You can avoid these problems by remembering to run spamassassin commands as the amavis user. For example su amavis -c 'sa-learn --sync --force-expire'

This script does have some entries that are dependent on the version of SA. If you are not running SA 3.2.5, the script may need to be edited, and you must remember to edit this file when a new version of SA comes out: vi /usr/sbin/

Notice the lines that may need to be changed. Change 3.002005 if needed (3.3.0 might be 3.003000 for example):
rm -f /var/lib/spamassassin/3.002005/saupdates_openprotect_com/
rm -f /var/lib/spamassassin/3.002005/saupdates_openprotect_com/
rm -f /var/lib/spamassassin/3.002005/saupdates_openprotect_com/loadplugins.pre
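Working out the directory name for a new version is easy to get wrong by hand, so here is a small Python sketch of the conversion. The pattern (minor and patch numbers each padded to three digits) is inferred from the two examples in these notes, and sa_version_dir is a made-up helper name, so double-check the result against what is actually in /var/lib/spamassassin.

```python
def sa_version_dir(version):
    """Turn a dotted SpamAssassin version such as '3.2.5' into '3.002005'."""
    major, minor, patch = (int(part) for part in version.split("."))
    # minor and patch are each zero-padded to three digits
    return "%d.%03d%03d" % (major, minor, patch)


print(sa_version_dir("3.2.5"))  # 3.002005
print(sa_version_dir("3.3.0"))  # 3.003000
```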

Exit (or save) the file and run the script:

Bind: using jail now
munin: can change munin frequency in /etc/cron.d/munin,


testing and how-to stuff:

This excludes much of what the server says back to you...

server1:~# telnet 25
Connected to localhost.
Escape character is '^]'.
220 ESMTP Postfix (Debian/GNU)
ehlo localhost
250-SIZE 10240000
250 DSN
mail from:<>
rcpt to:<>
Hi John,

just wanted to drop you a note.
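The same test can be scripted. Here is a rough Python 3 sketch using the standard library's smtplib and email.message modules. The addresses, the subject line and the two function names are placeholders invented for the example, and localhost on port 25 is assumed from the session above.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient):
    """Build a short plain-text test mail."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Test message"
    msg.set_content("Hi John,\n\njust wanted to drop you a note.\n")
    return msg


def send_test_message(msg, host="localhost", port=25):
    """Hand the message to the mail server listening on host:port."""
    # send_message takes the envelope addresses from the From and To headers
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


# placeholder addresses; replace them with real mailboxes on your server
message = build_test_message("sender@example.com", "recipient@example.com")
print(message["Subject"])
# send_test_message(message)  # uncomment on a machine where the mail server is running
```

The send call is left commented out so the sketch can be run anywhere; on the mail server itself, uncomment it and then watch the postfix log.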

look at log file for postfix sending. This should not involve spam filtering. spam filtering is only through (14) not actually sure about port 25 on check to inbox

At the bottom of the above link are also hints about dealing with logfiles and backing up config files.

spamassassin -t < message.eml

To see more info (what SA is actually doing):

spamassassin -D -t < message.eml

check website check squirrel mail


PyCon 2013


Python, Code, MyRamblings, Tech

by maria on 29 Mar 2013 - 05:40  

I went to PyCon in Santa Clara this year, and really enjoyed it. I learned a lot, and made quite a few connections. First the unpleasantness, the Adria Richards debacle. Much has been written already, so I'll make this brief. Adria Richards tweeted a picture of two men who were making sexual jokes behind her during a talk at the conference. Whether or not Adria chose the 'best' course of action for pointing out inappropriate behavior at a tech conference is an open question, and quite frankly beside the point. She chose what she thought was the best tool at the time, and there is no way she could have predicted what followed. What followed was a massive onslaught of threats and insults that was completely beyond the pale and speaks volumes about how much sexism exists in the tech community. The reaction of the tech community shows that this community can be a very uncomfortable and often downright hostile place for women, and when incidents like this happen, it makes me incredulous that some people still wonder why women leave the IT community. If you would like to read more, I recommend these articles:

If you want to be depressed about the general state of conditions for women in IT, check out geekfeminism.

And now, on to much better things. My favorite talk was 'The Naming of Ducks: Where Dynamic Types Meet Smart Conventions' by Brandon Rhodes. It was very informative, and done with humor and great slides. My biggest pet peeve about technical talks is slides containing huge swaths of programs. Most of the room can't even read it all, and all of that code usually distracts from the speaker's point, anyway. These were nice, small bits of code, stripped down to the bare essentials to make the point. His talk is up on the awesome pyvideo site:

And you can see the slides here:

Another favorite talk, which was just chock-full of useful tidbits, was 'Transforming Code into Beautiful, Idiomatic Python' by Raymond Hettinger. Another engaging, humorous speaker. His talk can also be seen on the pyvideo site:

and his slides are also available:

I also participated in a couple of days of sprints, and based on my experience, I have some unsolicited advice for anyone wanting to run a sprint. The purpose of a sprint is two-fold. The current software developers on the project want to get a piece of software out there, and the new software developers want to help. Those are the basics. It is hoped that everyone will learn something and have some fun as well. So, to accomplish this, the current software developers should do some homework before the sprint. If you actually want the new software developers to be able to help you, you must be able to get them up and running as soon as possible. Here are the most important steps, as I see it:

  • Make the development environment for your project easy to set up.
  • Document how to set up the development environment.
  • Follow your documentation and seriously spend time installing on new computers and/or wipe out your environment and re-install a couple of times. Simplify the procedure, make the instructions clearer, and repeat.
  • Document how the group uses versioning and which software is used.
  • List some tasks that need to be done, and rate the tasks by complexity and size.
  • Have an example test (or tests) to ensure that nothing has been broken by the new code.

I repeat, the more time you spend making sure collaborators can hit the ground running, the more help they can give you. This not only helps for the sprint, but will make your project more welcoming to potential contributors in general.
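
To make the last item in the list concrete, here is a minimal sketch of the kind of example test a project could hand to newcomers. The slugify function is a made-up stand-in for real project code, so treat everything here as illustration only.

```python
import unittest


def slugify(title):
    """Hypothetical project function: turn a title into a URL-friendly slug."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    """A small, readable test that new contributors can copy and extend."""

    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("PyCon 2013 Sprints"), "pycon-2013-sprints")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Hello   World "), "hello-world")
```

If the tests live in a file, running python -m unittest in that directory is enough for a newcomer to confirm that nothing broke.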


Charlie Rose


Health, Code Tech, Science

by maria on 26 Mar 2013 - 22:01  

I was on TV! Well, not really. My boss, Michael Shadlen, was on the Charlie Rose Show with Eric Kandel of Columbia University, Walter Mischel of Columbia University, Daniel Kahneman of Princeton University, and Alan Alda, host of the upcoming PBS program, “Brains on Trial”. But he showed some movies I made, so that's cool. There is a bit more information about the movies I've made for him here. The Charlie Rose show he is on is called "Public Policy Implications of the New Science of Mind" and it is part of his Brain Series. The whole show is very good, and I encourage you to watch all of it, but if you want to see the part where Mike talks about our research and shows the movies, go to 37:20. I created these movies by importing the experimental data into ActionScript and coding a re-creation of the experiment with the eye position of the animal superimposed on the re-creation of what she/he was seeing on the screen during the task. The spike train was added to the video, both visually and audibly, so you could get an idea of what was going on in the brain at the time. Link to Charlie Rose Show


openldap configuration


Linux, SysAdmin, Ldap, Tech

by maria on 08 Mar 2013 - 01:27  

It used to be that you edited slapd.conf to change configurations, but now openldap keeps its configuration in the database itself. This means you can change the configuration without having to restart ldap. Or you can totally screw it up while it's running. Cool. At any rate, the guides all helpfully say that now you change your configs just like you change anything else in your database, with ldapmodify. Great. But, wait, how do I do that exactly? Well, first you need to know the cn, which isn't going to be the same as for the changes you usually make. Fortunately, it is easy: cn=config. For example, what if you want to check what schemas are loaded?

ldapsearch -LLLQY EXTERNAL -H ldapi:/// -b cn=schema,cn=config dn

The -L option used here causes the results to be displayed in LDAP Data Interchange Format (LDIF). A second -L will disable comments, and a third one will prevent the LDIF version from being printed. The default is to use extended LDIF. The -Q will enable SASL Quiet mode, so it never prompts. The -Y specifies that we want the EXTERNAL (usually TLS) authentication mechanism. If you want to see the entire structure of the schemas, omit the dn from the end.

To get a nice overview of the configurations:

ldapsearch -LLLQY EXTERNAL -H ldapi:/// -b cn=config "(|(cn=config)(olcDatabase={1}hdb))"

When I first started using the new configuration scheme, I could not authenticate to make any configuration changes, even though I could still make changes to our database. Something happened when I converted from slapd.conf to the new configuration, so that I got this error message when trying to authenticate:

LDAP "Invalid credentials (49)" for cn=config (10.04 svr)

The problem is that normally I use cn=admin for making database changes, but to make configuration changes you have to use the olcRootDN, cn=admin,cn=config.

If you search by authenticating as config, you can tell if you will have this problem:

ldapsearch -xLLL -b cn=config -D cn=admin,cn=config -W olcDatabase={1}hdb

To fix this I re-created the olcRootDN and password. First get the encrypted version of your password by running this command:

slappasswd -h {MD5}

Type your password twice and copy the result into a file with the following contents (I used emacs config.ldif):

dn: cn=config
changetype: modify

dn: olcDatabase={0}config,cn=config
changetype: modify
add: olcRootDN
olcRootDN: cn=admin,cn=config

dn: olcDatabase={0}config,cn=config
changetype: modify
add: olcRootPW
olcRootPW: {MD5}your password here

dn: olcDatabase={0}config,cn=config
changetype: modify
delete: olcAccess

Include the {MD5} part before the actual password. The delete: olcAccess is so that users other than root can have administrative access. Now load the config file:

ldapadd -Y EXTERNAL -H ldapi:/// -f config.ldif
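
As an aside, the {MD5} value that slappasswd prints is, as far as I can tell, just the scheme tag followed by the base64-encoded raw MD5 digest of the password. Here is a minimal Python sketch of that idea. The function name ldap_md5 is my own, and the sketch is meant to show what the string contains, not to replace slappasswd.

```python
import base64
import hashlib


def ldap_md5(password):
    """Return an LDAP-style {MD5} value: base64 of the raw MD5 digest."""
    digest = hashlib.md5(password.encode("utf-8")).digest()
    return "{MD5}" + base64.b64encode(digest).decode("ascii")


# Hypothetical password, for illustration only.
print(ldap_md5("secret"))
```

The printed string is what would go after olcRootPW: in the file above, scheme tag included.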

One thing I wanted to do was change what was being indexed, so I created a file index.ldif:

dn: olcDatabase={1}hdb,cn=config
changetype: modify
add: olcDbIndex
olcDbIndex: uid eq
-
add: olcDbIndex
olcDbIndex: cn eq
-
add: olcDbIndex
olcDbIndex: uidNumber eq

and loaded that:

ldapmodify -QY EXTERNAL -H ldapi:/// -f index.ldif

I decided what to index on by looking at the log files to see what was being searched on, but wasn't indexed.
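
Going through the logs by eye gets tedious, so here is a rough Python sketch for counting the attributes that show up in "not indexed" warnings. The log line format is an assumption based on what slapd with the bdb/hdb backend typically writes, and the sample lines are made up, so adjust the pattern to whatever your log actually contains.

```python
import re
from collections import Counter

# Assumed warning format: "<= bdb_equality_candidates: (uid) not indexed".
NOT_INDEXED = re.compile(r"\((?P<attr>[^)]+)\) not indexed")


def count_unindexed(lines):
    """Count how often each attribute appears in a 'not indexed' warning."""
    counts = Counter()
    for line in lines:
        match = NOT_INDEXED.search(line)
        if match:
            counts[match.group("attr")] += 1
    return counts


# Illustrative log lines (invented for this sketch, not from a real server).
sample = [
    "slapd[1234]: <= bdb_equality_candidates: (uid) not indexed",
    "slapd[1234]: <= bdb_equality_candidates: (uidNumber) not indexed",
    'slapd[1234]: conn=7 op=1 SRCH base="dc=example,dc=com" scope=2',
    "slapd[1234]: <= bdb_equality_candidates: (uid) not indexed",
]
for attr, hits in count_unindexed(sample).most_common():
    print(attr, hits)
```

To use it on a real log, pass an open file object instead of the sample list; the attributes with the highest counts are the first candidates for olcDbIndex.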

Something else strange happened when I upgraded: the shadowLastChange attribute was missing from all of the people. So, I tried adding it back using this in my change.ldif file:

dn: uid=jd,ou=people,dc=example,dc=com
changetype: modify
replace: shadowLastChange
shadowLastChange: 15771

I got the following message:

add shadowLastChange:
modifying entry "uid=jd,ou=people,dc=example,dc=com"
ldap_modify: Constraint violation (19)
        additional info: attribute 'shadowLastChange' cannot have multiple values

That's strange. So, maybe it thinks it already has that attribute. Let's see what happens if we try to modify it instead of adding it:

replace shadowLastChange:
modifying entry "uid=jd,ou=people,dc=example,dc=com"
modify complete

Huh, well that seemed to have worked. Let's see what the value is now.

annette:~# ldapsearch -x "uid=jd" shadowLastChange
# extended LDIF
# LDAPv3
# base <dc=example,dc=com> (default) with scope subtree
# filter: uid=jd
# requesting: shadowLastChange 

# jd, people,
dn: uid=jd,ou=people,dc=example,dc=com

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

Um, so where is it? Interestingly, if I stop ldap and do a slapcat, it shows up in the ldif file. So, it is there, I just can't see it using ldapsearch. Let's check permissions:

ldapsearch -LLLQY EXTERNAL -H ldapi:/// -b cn=config "(|(cn=config)(olcDatabase={1}hdb))"

olcAccess: {0}to attrs=userPassword,shadowLastChange by self write by anonymous auth by dn=
 "cn=admin,dc=example,dc=com" write by * none
olcAccess: {1}to dn.base="" by * read
olcAccess: {2}to * by self write by dn="cn=admin,dc=example,dc=com" write by * read

So, for userPassword and shadowLastChange the last permission is by * none, which effectively means that no one can read it. That is what we want for userPassword, but not what we want for shadowLastChange. Hmmm. So, now we learn how to change permissions. Probably it is easiest to just make separate entries for userPassword and shadowLastChange. So, I created this file:

dn: olcDatabase={1}hdb,cn=config
changetype: modify
delete: olcAccess
olcAccess: {0}
-
add: olcAccess
olcAccess: {0}to attrs=shadowLastChange by self write by anonymous auth by dn="cn=admin,dc=example,dc=com" write by * read
-
delete: olcAccess
olcAccess: {1}
-
add: olcAccess
olcAccess: {1}to attrs=userPassword by self write by anonymous auth by dn="cn=admin,dc=example,dc=com" write by * none
-
delete: olcAccess
olcAccess: {2}
-
add: olcAccess
olcAccess: {2}to dn.base="" by * read
-
add: olcAccess
olcAccess: {3}to * by self write by dn="cn=admin,dc=example,dc=com" write by * read

Order is important, which is why I rewrote all of the olcAccess attributes. Also syntax is important. Make sure you are using the correct dn, have the changetype, etc. Now implement your changes:

ldapmodify -QY EXTERNAL -H ldapi:/// -D cn=admin,cn=config -f change.config

And that should do it. But, no, now even though I have these permissions:

olcAccess: {0}to attrs=shadowLastChange by self write by anonymous auth by dn=
 "cn=admin,dc=example,dc=com" write by * read
olcAccess: {1}to attrs=userPassword by self write by anonymous auth by dn="cn=
 admin,dc=example,dc=com" write by * none
olcAccess: {2}to dn.base="" by * read
olcAccess: {3}to * by self write by dn="cn=admin,dc=example,dc=com" write by * read

I still can't read the shadowLastChange.

Aha! And now another lesson in openldap permissions. Order is important. I have stated in my permission for shadowLastChange 'by anonymous auth', so if I try to look at shadowLastChange anonymously, we come to this directive and stop, because it looks like we don't have permission to read. It doesn't matter that later on we say that anyone can read. So, be careful when you are adding a new directive that you put it in the correct place, and that there isn't a directive there already that contradicts it. In this case, the way I added the new rule 'by * read', I was effectively saying that everyone but anonymous has read permission. If we move the directive 'by * read' earlier, we give anonymous read permission by giving everyone read permission, and it makes no sense to then say they have auth permission, which is more restricted access. So, we drop that directive altogether. Here is my new configuration:

olcAccess: {0}to attrs=shadowLastChange by self write by dn="cn=admin,dc=example,dc=com" write by * read
olcAccess: {1}to attrs=userPassword by self write by dn="cn=admin,dc=example,dc=com" write by anonymous auth by * none
olcAccess: {2}to dn.base="" by * read
olcAccess: {3}to * by self write by dn="cn=admin,dc=example,dc=com" write by * read
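
The "first match wins" behavior is easier to see in a toy model. The Python sketch below is a heavy simplification of my own, not how slapd is implemented: rules are plain tuples, and the requester is just a label such as "anonymous" or the admin DN. It only illustrates why the order of the by clauses matters.

```python
LEVELS = ["none", "disclose", "auth", "compare", "search", "read", "write", "manage"]


def who_matches(who, requester):
    """True if a 'by <who>' clause applies to the requester (toy matching)."""
    return who == "*" or who == requester


def granted_level(rules, attr, requester):
    """Return the level from the first matching 'to' and 'by' clauses."""
    for attrs, by_clauses in rules:
        if attrs is not None and attr not in attrs:
            continue  # this 'to' clause does not cover the attribute
        for who, level in by_clauses:
            if who_matches(who, requester):
                return level  # evaluation stops at the first match
        return "none"  # the 'to' clause matched but no 'by' clause did
    return "none"


def can_read(rules, attr, requester):
    """True if the granted level is at least 'read'."""
    level = granted_level(rules, attr, requester)
    return LEVELS.index(level) >= LEVELS.index("read")


ADMIN = "cn=admin,dc=example,dc=com"
CATCH_ALL = (None, [("self", "write"), (ADMIN, "write"), ("*", "read")])

# The rule that did not work: 'by anonymous auth' comes before 'by * read'.
broken = [
    ({"shadowLastChange"},
     [("self", "write"), ("anonymous", "auth"), (ADMIN, "write"), ("*", "read")]),
    CATCH_ALL,
]

# The corrected rule: the 'by anonymous auth' clause is gone.
fixed = [
    ({"shadowLastChange"}, [("self", "write"), (ADMIN, "write"), ("*", "read")]),
    CATCH_ALL,
]

print(can_read(broken, "shadowLastChange", "anonymous"))  # False
print(can_read(fixed, "shadowLastChange", "anonymous"))   # True
```

In the broken rule set the anonymous search stops at 'by anonymous auth' and never reaches 'by * read', which matches what ldapsearch -x showed.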

And, seems like a good idea to post the who and the what:

*                            All, including anonymous and authenticated users
anonymous                    Anonymous (non-authenticated) users
users                        Authenticated users
self                         User associated with target entry
dn[.<basic-style>]=<regex>   Users matching a regular expression
dn.<scope-style>=<DN>        Users within scope of a DN

Access Levels:

none      =0        no access
disclose  =d        needed for information disclosure on error
auth      =dx       needed to authenticate (bind)
compare   =cdx      needed to compare
search    =scdx     needed to apply search filters
read      =rscdx    needed to read search results
write     =wrscdx   needed to modify/rename
manage    =mwrscdx  needed to manage
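
The privilege letters show that each level includes everything the levels before it grant. Here is a small Python sketch of that idea, using the strings from the table above. The allows helper is my own name, and "=0" is modeled as an empty string.

```python
# Privilege strings copied from the table above (level -> privileges).
PRIVILEGES = {
    "none": "",  # "=0" in the table, meaning no privileges at all
    "disclose": "d",
    "auth": "dx",
    "compare": "cdx",
    "search": "scdx",
    "read": "rscdx",
    "write": "wrscdx",
    "manage": "mwrscdx",
}


def allows(granted, needed):
    """True if the granted level includes every privilege of the needed level."""
    return set(PRIVILEGES[needed]) <= set(PRIVILEGES[granted])


print(allows("read", "search"))  # True: rscdx contains scdx
print(allows("auth", "read"))    # False: dx lacks r, s and c
```

This is why 'by anonymous auth' is not enough to read an attribute: auth only carries d and x.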

Of course, all of this information and more is in the docs


Apache and SSL


Tech, Apache, SSL, PHP,Linux,SysAdmin

by maria on 01 Mar 2013 - 22:51  

Once upon a time we had set up our web server so that we had a secure VirtualHost pointing to a subdirectory of /var/www, so the DocumentRoot for the secure site was /var/www/https. This worked just fine. Some content was just for our own use, and required secure login, and some content was for the general public, and did not. Then we decided it would be better if our website was a wiki, so that everyone in the lab could update the website. Since our website was initially not a wiki, and we wanted to save some content from the old site, we had set up the wiki as a subdirectory of /var/www. The wiki had some links to the old content. Once we got the wiki up and running, we decided it would be good if the login to the wiki was over SSL.

To do this, we needed the main /var/www directory to allow https access sometimes, but not always. If you always run everything over SSL, there is larger overhead, and pages are likely to load more slowly. So, we can just make the DocumentRoot for both /var/www and use code to switch between SSL and non-SSL, right? I assumed that pages would be served by HTTP unless SSL was requested, but it turns out at least one browser I know of will choose SSL over non-SSL if both are offered by the server. Which means that the dumb solution of just checking to see if the person requesting a page is trying to edit or login and then requiring SSL wasn't going to work, as the general public looking for our website was just as likely to be given a login window as site content. This could be dealt with on the wiki, since it runs on PHP, but it was not clear how to deal with the rest of the website.

So, maybe we just want to worry about the wiki having SSL login, since that is the only place needing it. Anything else on the website has to be edited directly on the server. The alias directive seemed a good solution. This allows you to add content not under the document root to be served as part of the document tree. You enable the mod_alias module by the following commands

a2enmod alias
service apache2 restart

Now, I am using CleanURLs for my wiki, so I needed to figure out how to set this up when using SSL. I don't know of a way to do an if statement in htaccess to check if someone is using SSL, but a side effect of having different root directories for the normal and SSL site is that I could just use 2 different htaccess files. One is for the non-SSL site:

RewriteCond %{HTTP_HOST} ^
RewriteRule (.*)$1 [R=301,L]

and the other for the SSL site:

RewriteCond %{HTTPS_HOST} ^
RewriteRule (.*)$1 [R=301,L]

And the Rewrite Base and Rewrite Rules are going to be slightly different.

So, now for the PHP solution. I started by using a recipe on the PmWiki website for enabling SSL for the initial log in. Really, this seemed to be the only time that SSL was really necessary. But, there seemed to be no memory in the PHP code of logging in via SSL when we returned to HTTP, so after the initial logging in, if you tried to edit a different page, you were asked for a password again. Well, that got old fast, and hinted that there was a possible security hole. The most sensible solution seemed to be to have the whole session using SSL, but there was no obvious way to do this from the recipes available on the PmWiki site. So, I went on the PmWiki user mailing list to try to figure out how to adapt one of the recipes for my purpose. In the end, I used a combination of a hint from Patrick Michaud and a recipe by jtankers. If you're interested, you can see my code: pmwiki ssl

And, indeed, it seemed to work fine, mostly. But, remember how alias adds content not under the document root to be served as part of the document tree? This means that if you try to see content above the alias (i.e., something in /var/www), according to the secure site configuration, Apache should look in /var/www/https, since this is the root for the secure site. So, stuff that had been linked to in the wiki from other directories in /var/www was not showing up in the secure site. I managed to solve this by changing paths in the PHP code that runs our wiki.

More Apache hints, in no particular order:

  • MultiViews requires that files are owned by the group that apache runs as, in my case www-data, and that permissions are set to 770:
    1. chgrp www-data /var/www/ -R
    2. chmod 770 /var/www/ -R
    When you edit files, make sure you are a part of the correct group (newgrp www-data) or that you change the group after you edit.
  • if you are using Clean URLS, you need to have AllowOverride set to All
  • not strictly an Apache hint, but if you want to post an email address on a website in such a way it is unlikely to be found by bots, use an ascii to html code converter (converter apps available on the web, just search), and enter the email address with html code (looks like test). To be extra careful, add spaces around the @ symbol.
  • mod_rewrite can do everything mod_alias can do, and a lot more.
    • Use mod_alias (Redirect in htaccess) when you can, because it is cleaner, uses less CPU and overhead, and makes it easier to figure out what is going on when looking at the configuration after the fact.
    • Use mod_rewrite when you are trying to stop things from displaying in the url bar.
    • rules in .htaccess are executed in order, however, Rewrite has priority over Redirect.
    • more excellent hints on mod_rewrite and mod_alias can be found here
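
For the email address hint in the list above, you don't strictly need an online converter. Here is a minimal Python sketch that turns every character into a decimal HTML entity. The function name and the example address are made up for illustration.

```python
import html


def to_html_entities(text):
    """Encode every character as a decimal HTML entity, e.g. '@' -> '&#64;'."""
    return "".join("&#{};".format(ord(ch)) for ch in text)


encoded = to_html_entities("someone@example.com")  # hypothetical address
print(encoded)

# A browser (or html.unescape) turns the entities back into readable text.
assert html.unescape(encoded) == "someone@example.com"
```

Paste the printed string into the page source; visitors see the normal address, while the raw HTML contains only entity codes.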


Thanks Mike!


Tech, MyRamblings, MyLife, Kids

by Maria on 28 Feb 2013 - 01:38  

I thought about waiting another couple of weeks before posting, so it wouldn't be so obvious that I have been seriously delinquent with posting, but heh, life happens. And lots of life has happened. For starters, my son was born last April. Insert requisite photo here:

[Photo: bash.jpg]

So, for a while I chose sleep over blogging. So goes it.

Another life that happened is that my boss moved to Columbia University in NY. We decided not to follow him, so I have been slowly beginning the job hunt process. I'm nervous and excited. Looking forward to the new challenge. I've learned so much working for Mike. It has given me a sampling of all kinds of stuff, and allowed me to recognize what I am truly interested in, programming, while giving me a broad base of skills that complement coding.

Mike, thanks so much for giving me the opportunity to work in your lab. It has been a great adventure. I was given freedom to build what I thought was needed. A diverse and awesome team to work with. Guidance when it was needed. Fun and interesting projects to work on. I loved making movies for your talks. Creating movies that were a truly accurate representation of the lab experiments was challenging and intriguing. Working in neuroscience, I practiced the scientific method. As a system administrator, I gained awesome problem solving and troubleshooting skills. Working on projects while being responsible for the day to day running of the servers and being a consultant to others in the lab (and in the neuroscience community in general) taught me organizational skills and improved my communication. Not to mention the value of good documentation and testing, testing, testing. And, of course, a greater appreciation of soccer and jazz.

As this job ends, I have also decided to combine my tech writing and my blog writing more, instead of having a separate part of my website for "work", which often just ended up being "anything tech I ran into and found interesting or hard to figure out and wanted to document". I was reluctant to write tech stuff in my blog, because that is often a work in progress, and I was trying not to edit my blog entries (much), once they were written. But, heh, this is my blog, so I can make the rules, and I'd rather start using the blog for tech stuff as well as life ramblings so that I can take advantage of keywords and stuff. So, there you go, some blog entries are going to evolve over time as I learn.

And, for some inspiration:

To me, coding is writing stuff that makes computers come to life. In the Wizard of Oz the wizard is seen as a fraud, creating smoke and mirrors to hide that he is an ordinary man, but I like to think of it as being the ordinary person that is proud to be the wizard that by just writing "stuff" makes wondrous things happen.


Adobe, A rant


MyRamblings, Tech

by Maria on 03 May 2010 - 15:53  

I have enjoyed using Flash for quite a while. Mostly I use it to make movies for work, but I have been playing around with it lately to make more creative animations. So, when I heard the news that Microsoft Agrees With Apple And Google: "The Future Of The Web Is HTML5", I was a bit dismayed. Now I have long believed that Flash was wrong for creating websites, but thought it would remain the standard for video and games. And, I believe that will still be true for a while. If you look at the demo page for HTML5, you will see that most of the demos are using things useful for building interactive websites, but none of them demonstrate animation created by HTML5. Even the stuff that will eventually be used to create online games is pretty crude yet. Not sure what the Canvas demo does, since I couldn't get it to load with any of the 3 browsers I tried. So, I think we are a ways yet from animation and online games with HTML5. However, given my recent experiences with Adobe, I am thinking about learning HTML5 now anyway, even though I will be much more limited in what I can do, because I am sick of Adobe. Adobe Tech Support sucks! Not to mention their programs are getting to be so bulky and buggy they are painful to use.

My Adobe saga:

Part 1:

Saving a pdf without comments.

I thought this was pretty straightforward, but I had to repeat what I was trying to do 4 times before they gave me a solution. All I was trying to do was to hide/get rid of the comments in a pdf that was being sent in an email. They gave me solutions for how to use comments for an email review, told me how to hide comments from my current view, etc. They even wanted me to send a pdf with comments in it, because that was somehow going to help them understand what I wanted. Hello, you are Adobe, surely you have a pdf with comments in it lying around on your desktop?!? Finally, after 6 emails from Adobe, they gave me the solution. For those curious, here is the highly intuitive solution:

Go to Advanced -> PDF Optimizer -> Discard User Data -> check "Discard all comments, forms and multimedia" -> click OK.

Now save this pdf with a different name, and you can send your pdf itinerary to your boss, without your comments about meeting your colleagues after the meeting for drinks. Are we really the only people who find this useful?

Part 2:

Upgrading the Organizer in Acrobat

My boss upgraded from Acrobat 8 to Acrobat 9. When he tried to open the organizer in Acrobat 9, the window was missing, and it was apparent from the menu that nothing from Acrobat 8 had been moved over. I sent in an email request for help, but was told this was not an installation issue, and I needed to have bronze support. I tried calling them, spent eons on hold, just to have them tell me, once again, that this was not an installation issue so I needed to pay for support. Not an installation issue? I installed the software, and it didn't work, and didn't import stuff from the last version. How can this be anything except an installation issue?!? So, I went off in search of paid support. Buying support from Adobe is convoluted, especially if you have a volume license. Supposedly there are support packages, where you get so many support calls per year, or maybe some number of support calls, but I never did figure this out. Nor did I figure out what bronze support is. Since there is a new version of Adobe products coming out, I decided it was probably best to just buy one support instance, especially since given the cost of my time doing research trying to figure this shit out, it would probably be cheaper to pay by the instance anyway. So, I spent another 2 hours on the phone, mostly on hold, during which I solved the missing Organizer window problem without any help from Adobe. When they told me the import problem wasn't an installation issue, I said fine, I'll pay. They ended up not charging me, although they lectured me on how this was an Acrobat 8 issue (since I was trying to export from Acrobat 8), so next time they would charge me. Like what, I didn't buy Acrobat 8 from them, and the reason I was trying to export was to have a WORKING INSTALLATION of Acrobat 9? WTF? But then, in the end, they told me it was impossible. You cannot get your Organizer settings from Acrobat 8 to Acrobat 9. I filed a bug report. 
I had already sent a letter to the CEO complaining about their tech support, but maybe I should send him an addendum?


Red Herrings


Health, Politics, MyRamblings, Tech, Science

by Maria on 13 Apr 2010 - 21:10  

Herring (Kippered)

I very much enjoyed the TED talk by Michael Specter on the danger of science denial. His main point is that we will continue to do real damage to our planet and our communities, if we continue to ignore what science tells us. His two main examples are the trend to not immunize because of the supposed link between autism and immunizations, and frankenfoods, in other words, genetically modified foods. I think both of these cases demonstrate the public's tendency to take a scary finding, latch onto the first thing that comes along to blame, and then ignore science and facts and beat the hell out of the red herring. In the case of autism and immunizations, study after study has shown there is no link. But the original study, however misguided, did demonstrate that we need to continue to put pressure on manufacturers and the government to ensure that vaccines are safe to use, as some things were brought up that were questionable. We need to learn to accept science and facts when they become undoubtable, stop beating a dead horse, and look to new places for answers. That second point is very important. There is much money and time now being spent trying to convince parents that autism is caused by immunizations, money that should be spent on coming up with the actual causes of and cures for autism. Not to mention this misguidedness is causing a crisis in immunization that could cause many diseases that we have not seen in decades to return to the United States. If you are unconvinced that immunizations do not cause autism, check out this pdf from

The second issue, genetically modified foods, is very interesting. In this case, the red herring is GMOs themselves. Although more research is needed, so far, it appears that the insertion of new genes does not, by itself, change the plant in a negative way. In Specter's talk he mentioned the noble ideas about adding vitamin A in rice and adding protein and vitamins in cassava, using genetic modification. He did not mention anything about adding resistance to pesticides or insecticides. These are the truly scary things, the things we should be up in arms about. The movie Monsanto's World is extremely interesting, and brings to mind the things we need to be extremely concerned about. First and foremost are the ties between government and corporations. Monsanto has become a scary monopoly because the US government let it happen, and, in fact, encouraged it to happen. And, it can, and probably has, happened in other industries as well. It is the ties between industry and government that have caused the scientific data to not be scrutinized as it should be. Check out the wikipedia article about Monsanto, under Public officials formerly employed by Monsanto. Which brings up an interesting question. Who should be in charge of government agencies that oversee industries? In many cases, it seems the government decides that people from industry are the best choice, since they would presumably know the most about that particular industry. But, they also have the hardest time separating themselves from the corporations they used to be a part of, and present a real conflict of interest. Time after time, in many different industries, government has failed to enforce or enact the regulations it should in the interest of public safety, because of the ties with corporations. The other thing that we should be up in arms about is the abuse of patent law by Monsanto. Monsanto has used patent law to bully farmers, so that it now controls most of the U.S. corn and soy seed market, according to the non-profit Center for Food Safety. And there is no doubt that Monsanto and its connections in government have worked hard to suppress scientific evidence that its products are not as harmless as it claims. But, you shouldn't take my word on this, do your research. So, while I agree with Specter about there being good that can come from genetic modification, and while at its root, it is not much different from the modifications we have been making to animals and plants for hundreds of thousands of years by breeding, there is still some very scary stuff going on in the genetic modification industry, and most of it has to do with the corporation that controls a very large portion of the seed market, Monsanto, and allows farmers to completely douse their fields with herbicides and/or insecticides. And regardless of whether the food that has been modified to survive such dousing is harmful, we already know that dousing fields with herbicides and/or pesticides is terrible for the soil and the nature/people surrounding the fields. For the most common of these herbicides, Roundup, check out the wikipedia article.

Which brings me to another interesting article I have read recently. In the article Is it okay to ignore results from people you don't trust? by Ben Goldacre on He gives a nice example of industry scientists getting the results you would expect them to want, which was different from what non-industry scientists found. Repeated experiences like this make it easy for us to ignore results from people we don't trust. We have come to expect scientists from industry to get results more favorable to their industry (which is why the government should have been more critical of the data from Monsanto), but then he goes on to give an example of researchers you may not normally trust, publishing a study with a result that was both accurate and earlier than any other researchers. So, it appears that it is not enough that the public pay attention to scientific data, the public must learn to think critically about the data that they are given. Consider the source, but also consider the data itself. Ask questions. Be skeptical, but do not reject science simply because you want to believe in voodoo. And above all, do not look for studies to validate your opinion, because you will find them no matter how crazy your opinion is. Instead, look at everything you can find that examines the question with an open mind, consider the sources, the methods, the number of studies, and ask questions until you are satisfied. But when some new piece of evidence comes up, be willing to look anew at the question, and to reconsider your position. Yup, it is a lot of work, but it is so very important to our health and the health of our planet.


Ada Lovelace Day


Lovelace, Tech

by maria on 25 Mar 2010 - 01:30  

Today is Ada Lovelace Day once again, and I thought today I would spotlight a modern-day techie entrepreneur. Cathy Malmrose started her own business selling hardware running Linux in 2007. She impresses me not only because I am awed by people willing to start their own business, but also because she was discouraged from anything technical or scientific as a child. It took her a long time to overcome this discouragement, but she has in a big way, and now is an inspiration to girls and women interested in science and technology. I just love her journal entry about her girls learning how to install Linux on a computer.

Nelson Mandela is an inspiration to her, and the name of her company, ZaReason, comes in part from Za, the country code for South Africa, and Reason, "which translates well in many languages, and has many meanings". I love that she decided to include a screwdriver with all ZaReason computers to "communicate that we respect people's ownership of their new laptop or desktop and we respect their intelligence to be able to modify it."

Cathy is also involved with charitable projects through a non-profit,

Links about Cathy:

Ada Lovelace Day ~ Comments: 0


Week in Review


Health, Politics, Tech, Science, Videos

by maria on 31 Jan 2010 - 20:04  

Lots about death this week, but let's start with autism. Andrew Wakefield, the doctor who supposedly linked MMR and autism, is closer than ever to being banned from practicing as a doctor, according to NewScientist. Apparently the ban (on him and two co-authors) doesn't actually have to do with the autism claims, but has "concerned itself with the conduct, duties, and responsibilities of each doctor". However, the findings of the investigators, apparently peppered with words such as "dishonest", "irresponsible" and "misleading", do seriously call into question his integrity as a scientist as well. It is so sad, the panic this man's irresponsible claims have caused over immunizations. While it is true that the attention over this has caused manufacturers and regulators to pay more attention to the safety of vaccines, which is very important, it has also meant much valuable time and resources have been spent disproving this link. Time and resources that should have been going to investigating more likely links.

Continuing on to the death theme, we move on to a very concerning development with the "suicides" in Guantanamo back in June of 2006. I highly recommend reading the Harper's article in full, but if you want the short version, watch the video at the bottom of the update. I am sickened by our government, and hope that the Obama administration will do the right thing and come clean about all that has happened regarding Guantanamo and the USA's policies of torture, both before and since they came to power.

This afternoon I read an article in The New Yorker about dying and mourning. I had already been thinking about death after hearing an amazing podcast from Radiolab. The 8th segment, at about 13:30, is a story by David Eagleman from his book, SUM, read by Jeffrey Tambor. I recommend listening to the entire hour, but this is the story that got me thinking down this particular line. It is sort of an echo of something that I had been thinking about, although better articulated than I could have done, and it's kind of a natural continuation of my thoughts about emergence. It is the thought that there is a connection that we all have at many levels. There is the connection between our atoms, molecules, cells and cell structures, organs, organisms, planets, etc., which form groups at various levels. Maybe it is true that at each level there is some awareness of the interconnectedness, and some feeling like loss when the group breaks up. Strange that a type of mourning that may happen to my atoms when I die is a comfort to me, and who's to say there is no awareness in atoms or planets? Next thing you know, I'll be following the Church of the Flying Spaghetti Monster. I do recommend the article in The New Yorker about dying and mourning, which has nothing to do with the Flying Spaghetti Monster. I agree with Meghan O'Rourke: I think we do not do the death and mourning thing well in the USA.

Before we leave the death theme, I'd like to take a moment to join many fans, friends and family in the mourning of Howard Zinn and J. D. Salinger. Both made amazing contributions to our society, and I am very grateful for their lives, loves and works.

On the tech front, a scary thing happened with Facebook on AT&T phones. Apparently last weekend, some people with AT&T phones logged into Facebook, and found themselves in someone else's account. There is a good, but somewhat technical, article about what happened and what needs to be done about it at the Electronic Frontier Foundation.

As a reaction to the crazy ruling recently by the Supreme Court, Murray Hill Inc. is running for Congress. Hmmm.

Interesting article about skunk weed. According to the article, "studies have shown that pure, synthetic THC causes transient psychosis in 40 to 50 per cent of healthy people". Apparently, there is normally a compound in weed, cannabidiol (CBD), that counteracts the psychosis-producing effects of THC. Guess we should stick to the other strains...

Finally, time for some fun. Start with the Ultimate Graphic Novel (in Six Panels). The first comment was almost as good as the novel. Also discovered a great music site, and found a cool new video, Anna Rose "Picture":

Week in Review ~ Comments: 0





by maria on 27.05.2009 - 16:30  

What a struggle. Normally I post things like this in my Work directory, but I wanted people to be able to post comments, so I'm posting here. Consider it a constant work in progress, as I will continue to learn about LDAP.

LDAP on port 389 can run with or without TLS. This is so you can do anonymous binds without encryption (as far as I can figure, there is no reason to require an encrypted connection to find out public information when not using a password). So what I want to figure out is whether the LDAP server can require TLS for all queries that require a password. Presumably, we have already decided with slapd.conf ACLs which LDAP entries are private enough to require a password.
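
A minimal sketch of one way this might be done, assuming the server is OpenLDAP's slapd. The slapd.conf(5) man page documents a security directive with a simple_bind factor, and ACLs for restricting the password attribute. The value 128 and the ACLs below are illustrative assumptions, not tested settings from this server, so it is worth checking that anonymous searches still behave as expected afterwards:

```
# slapd.conf (global section) -- illustrative values only
# Refuse simple (password) binds unless the connection has a
# security strength factor of at least 128, e.g. after StartTLS.
security simple_bind=128

# Ordinary ACLs still decide what is public and what needs a password.
access to attrs=userPassword
    by self write
    by anonymous auth
    by * none
access to *
    by * read
```

With something like this in place, a password bind without -ZZ should be rejected, while an anonymous search on port 389 should be unaffected.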

rootdn can be used for initial setup, but it is best to set up a user in the database to be admin, and then get rid of rootdn.

Adding ssl start_tls to ldap.conf seems to disable anonymous binds.

test gnutls:

on server:

gnutls-serv --x509certfile /etc/ldap/certs/server.crt \
            --x509keyfile /etc/ldap/certs/server.key

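on a client (needs gnutls-cli and cafile; ldap.example.com is a placeholder for the server's hostname, and 5556 is the port gnutls-serv listens on by default):

gnutls-cli --x509cafile /etc/ssl/certs/ca-cert.crt -p 5556 ldap.example.com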

will give cert info:

openssl x509 -in /etc/ldap/certs/ldap.cert.pem -text -noout

test tls with ldap (-x: simple bind instead of SASL, -ZZ: require StartTLS to succeed, -d 255: verbose debug output):

ldapsearch -x -ZZ -d 255

ldap error codes:

LDAP ~ Comments: 0


Email is Evil



by maria on 22.04.2009 - 01:45  

Recently I came across this blog post, The Email Problem and How to Solve it. I wouldn't say there were a whole lot of solutions presented, but it got me thinking about how much time I spend checking email. Part of the problem is that many of us are tied to email in a fundamental way for our jobs. Right now I am having problems with computers overheating, so if I am not right next to the computer room, and checking the temperature as regularly as I check my email, then I do need to be checking my email often, to make sure I am not getting an email from the server telling me, "hey, I'm overheating, do something!" My job being computer admin, there are all kinds of reasons why I have to be checking my email regularly. But it is a major distraction, and it is often difficult to get back on task after checking it. I think one solution that the author did not really touch on is to expand the tagging ability of emails and the filtering of alerts. Right now we can tag emails after we have read them as important, but other than using capital letters in the subject, it is difficult to tag an email before you send it. It would be great if I could filter my email alerts to critical sometimes, kind of like a 'do not disturb except during emergencies' sign, and only be told about servers overheating or other true emergencies when I am in the middle of some programming that takes all of my concentration. Or even creating different levels of alerts would be a good start, for example, when I am in critical work mode, I only want to know about emails from my computers and my boss.
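
A rough sketch of the "do not disturb except during emergencies" idea in shell. This is not a feature of any existing mail client; the function name should_alert, the mode names, the "[EMERGENCY]" subject tag, and the example.com addresses are all made up for illustration:

```shell
#!/bin/sh
# should_alert MODE SENDER SUBJECT
# Exit status 0 means "raise an alert", 1 means "stay quiet".
# Mode names, addresses, and the subject tag are hypothetical placeholders.
should_alert() {
    mode=$1
    sender=$2
    subject=$3

    # Outside critical work mode, every message alerts as usual.
    if [ "$mode" != "critical" ]; then
        return 0
    fi

    # In critical work mode, only the servers and the boss get through...
    case "$sender" in
        *@servers.example.com | boss@example.com) return 0 ;;
    esac

    # ...plus anything the sender tagged as an emergency before sending.
    case "$subject" in
        "[EMERGENCY]"*) return 0 ;;
    esac

    return 1
}
```

For example, `should_alert critical root@servers.example.com "temperature high" && echo "ALERT"` would print ALERT, while an ordinary message from anyone else would stay quiet until the mode goes back to normal.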

Email is Evil ~ Comments: 1


Ada Lovelace Day


Lovelace, Politics, Tech

by maria on 24.03.2009 - 13:42  


It has become obvious that women need to see female role models, in order to persevere and thrive in male-dominated fields. So, Suw Charman-Anderson announced that she would post a blog if 1000 other people also promised to post a blog about a woman they admire who has excelled in technology. She is calling it Ada Lovelace Day in honor of one of the world's first computer programmers.

So, I signed the pledge, and talked to my daughter Tanika about it. She told me that a woman invented the dishwasher, which I did not know, and recommended I look into that. Josephine Cochrane did invent the first workable mechanical machine to wash dishes. Apparently she had grown tired of her servants breaking her dishes, and is quoted as saying, "If nobody else is going to invent a dishwashing machine, I'll do it myself." I love it. She designed a wheel that sat inside a copper boiler, and held several different compartments made of wire to hold different types of dishes. A motor turned the wheel and pumped hot soapy water from the bottom of the boiler. I love the image I have in my head of her in the early 1880s at work in the shed behind her house, hammering pieces of hardware to a copper wash-boiler. She received a patent for it in 1886, and founded the Garis-Cochran Dish-Washing Company to produce it, which later became the KitchenAid part of the Whirlpool Corporation. Another great quote: “Women are inventive, the common opinion to the contrary notwithstanding. You see, we are not given a mechanical education, and that is a great handicap. It was to me—not in the way you suppose, however. I couldn’t get men to do the things I wanted in my way until they had tried and failed in their own. And that was costly for me. They knew I knew nothing, academically, about mechanics, and they insisted on having their own way with my invention until they convinced themselves my way was the better, no matter how I had arrived at it.” Things were definitely difficult for women in the late 19th century, both as inventors and business owners, and she should be applauded as much for her bravery in getting into business as for the invention itself. Another quote, regarding the first sale she made, to a large hotel in Chicago: “You asked me what was the hardest part of getting into business,” Mrs. Cochrane recalled for the reporter for the Record-Herald.
“That was almost the hardest thing I ever did, I think, crossing the great lobby of the Sherman House alone. You cannot imagine what it was like in those days, twenty-five years ago, for a woman to cross a hotel lobby alone. I had never been anywhere without my husband or father —the lobby seemed a mile wide. I thought I should faint at every step, but I didn’t—and I got an $800 order as my reward.”

picture and some background from Hall of Fame inventor profile
other sources:
American Heritage Profile
University of Houston profile

Ada Lovelace Day ~ Comments: 0


Cool new toy


Science, Tech

by me on 21.05.2008 - 15:20  

This just looks like a bunch of fun:

In completely unrelated news, I also liked this post about the theory of evolution:

Finally, here is a cool black hole demo.

Cool new toy ~ Comments: 0


geeking out



by Maria on 21.11.2007 - 03:19  

The other day, David told me that there existed some sort of external tray that you could set internal drives into so you could access them easily. I got very excited about this, so I looked around. I found something very sweet:

You can hook up the hard drive to this cable and plug it into a USB port. How cool is that? That will tremendously cut down on the time I spend opening computer boxes and moving hard drives around! Can't wait to get it.

Speaking of playing with hard drives, I finally did some research about IDE/ATA hard drives and cables. I used to think there was something wrong with some of my motherboards and/or cables, because of strange behavior regarding which drive was considered the boot drive, or even whether drives would show up in the BIOS. Turns out there are two different kinds of IDE/ATA cables. Some of them don't respect the cable select setting. Sometimes adding a hard drive to a cable causes the position of the master to change. So, if you thought you could just add a second hard drive, boot from the original, and check what was on the second hard drive: nope, now it will try to boot from the second hard drive. There is a good reference about how the cables work here:

I am deeply embarrassed by how long it took me to look this up, but very glad I finally did. Explains a lot... I think part of the reason it took so long is that most of the time when I am moving drives around, I am dealing with some failure, and so I assumed it was part of the failure. It looks like that site has a lot of good information about all kinds of computer hardware.

geeking out ~ Comments: 0
