Tuesday, October 25, 2016

States of the Arts

Fall 2016 has seen important improvements to the state of the art in both speech synthesis and speech recognition.

In September, Google DeepMind unveiled WaveNet, a speech synthesis system that generates exceptionally natural-sounding speech. In October, Microsoft Research announced that they had developed a speech recognition system that matches the word error rate of human transcribers.

A few observations.

Life moves pretty fast.

It’s a time of rapid progress in speech and spoken language processing. Microsoft, IBM and Baidu have all posted better and better speech recognition numbers in the last few years.

Deep Learning has the goods.

It’s very easy to be dismissive of “deep learning” as being over-hyped. However, both of these advances rely heavily on deep neural networks. So far, they continue to deliver on their promise.

Arxiv.

One of the first important ASR papers showing that DNNs can outperform traditional GMM acoustic models on a hard task (i.e. Switchboard) was presented at Interspeech 2011. This means the work was done at least 6 months earlier. Both of these advances are described not only by press releases and glossy webpages, but also by technical papers [WaveNet paper, MSR paper]. Both were posted to arxiv. There’s no doubt that immediate self-publication is flooding the scientific engine with oxygen. Progress is rapid because we’re still learning the limits of neural networks, but also because groups are able to compete and learn from each other much more quickly than semi-annual conferences enable.

WaveNet is a new way of doing things.

WaveNet synthesizes speech in a novel way. The resulting waveform is generated one sample at a time, with each sample conditioned on the samples that precede it. This is essentially doing parametric speech synthesis without a vocoder. Not only does this approach work surprisingly well, it’s exciting in its newness as well. There’s other novelty in this work too (1. Using a classification approach to predict discretized mu-law values instead of predicting a continuous value 2. The dilated convolution layers.) but this work is most important for showing the promise of this approach to generating audio data.
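To make the "discretized mu-law values" idea concrete, here is a minimal sketch of mu-law companding followed by quantization into integer classes. The function names and the choice of mu = 255 (256 classes) are my assumptions for illustration; this is not code from the WaveNet paper.

```python
import math

MU = 255  # assumption: 8-bit mu-law, which gives 256 output classes


def mu_law_encode(x, mu=MU):
    """Map a waveform sample in [-1, 1] to an integer class in [0, mu]."""
    x = max(-1.0, min(1.0, x))  # clip to the valid range
    # Compress: small amplitudes get more resolution than large ones.
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift from [-1, 1] to [0, mu] and round to the nearest class index.
    return int(round((compressed + 1) / 2 * mu))


def mu_law_decode(k, mu=MU):
    """Map an integer class in [0, mu] back to an approximate sample."""
    compressed = 2 * k / mu - 1
    return math.copysign(math.expm1(abs(compressed) * math.log1p(mu)) / mu, compressed)
```

With samples turned into class indices like this, predicting the next sample becomes a softmax classification problem rather than a regression over continuous values.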

The Microsoft ASR work is not a new way of doing things.

The work is of the highest quality, without a doubt. This paper represents an exceptionally well-engineered collection of effective speech recognition tools that have hit an interesting milestone. However, the individual pieces will seem familiar to anyone up to date on the current state-of-the-art. The major improvements come from improvements to language modeling (an LSTM LM specifically), LACE units which MSR showed at Interspeech this year, lattice-free MMI, and spatial smoothing (which is similar to Cambridge’s “stimulated training”). The Microsoft team has put these parts together more effectively than anyone else, and it’s an important achievement. But compared to the WaveNet development, it’s a more incremental step.

Monday, October 13, 2014

Interspeech 2014 Recap

This year's Interspeech was in Singapore. Singapore is, in some ways, a very easy venue to travel to. It's a modern, cosmopolitan city. They speak English. It's tropical, but you're never more than a hundred meters from air conditioning. In other ways, it's so far. Over 20 hours each way. I like airplanes. They're as magical as any technology we've got. But 20 hours is a long time to sit still. Think about how many steps you take in 20 hours. How many different faces you see. Then reduce that to about 200 steps, and 10 people.

"Because we are all poets or babies in the middle of the night, struggling with being." - Martin Amis "London Fields"

Interspeech 2014 was a well run conference. The quality of papers was generally quite high. The venue easily handled the size of the event and the wifi was steady. It was difficult to find enough food at the Welcome Reception, but easy to find enough beer. The banquet was flawed -- segregating vegetarians is pretty rude -- but they all are, and at least there was enough food to go around, and everyone ate promptly. And of course, there was Mambo and Jambo. I'm not going to go into it here, but find someone who attended the opening ceremony and ask them to describe it. Then don't believe them, and ask someone else to do the same. It was "odd" at best. But be sure, I'll be attending the Dresden opening ceremony to see how they one-up it.

Deep Learning
A few years ago, DNNs invaded speech conferences. Deep Learning is still a significant buzzword, and a hot topic. But, for the better of everyone involved, the intensity has cooled. Now the interest in DNNs seems to have shifted into 1) understanding how they work, and how to best train them to a task and 2) Long Short-Term Memory units to model sequential data. The latter really broke out at this year's conference. There were a number of papers finding them to be an effective alternative to traditional recurrent nets trained with back-propagation through time.

BABEL
I've been involved with the IARPA-BABEL program, so my view is pretty biased on this front, but I felt like the presence of BABEL in this year's Interspeech was particularly large. The program's central task is performing keyword search on low-resource languages. It has an aggressive evaluation schedule with an increased number of *new* languages involved each year. There were at least two sessions devoted to keyword search, and papers evaluated by either BABEL-proper or the NIST OpenKWS challenge seemed to be all over the conference. (Searching the paper index suggests that there are between 30 and 40 BABEL papers, and another 10 or so OpenKWS papers.) It seems clear that this program has had a large impact on ASR and KWS research. 2014 was certainly the high-water mark here, as the program shrunk by 50% last year, but it's worth noting its effect on the field.

Some standout papers

I don't mean to suggest that these are "the best" papers, but they're ones that caught my eye for one reason or another.

Acoustic Modeling with Deep Neural Networks Using Raw Time Signal for LVCSR by Zoltán Tüske, Pavel Golik, Ralf Schlüter and Hermann Ney. Part of the promise of Deep Learning is the ability to "learn feature representations" directly from data. This is frequently touted as a description of what is happening in the first hidden layer of a deep net. So, the logic goes, do we need MFCC/PLP/etc. features, or can we do speech recognition directly on a raw acoustic signal? This is the first paper I'm aware of to affirmatively show that "yes, yes we can". It requires a good amount of training data, and rectified linear (ReLU) neurons work better for this, but 1) it works competitively with traditional features and 2) many of the first hidden layer neurons can be shown to be learning a filterbank. Very cool.

Backoff Inspired Features for Maximum Entropy Language Models by Fadi Biadsy, Keith Hall, Pedro Moreno and Brian Roark. In n-gram language modeling, when a sequence of words A, B, C has never been observed, its n-gram probability P(C|A,B) can be approximated by the probability P(C|B). But to be sensible about it, you've got to apply a backoff penalty. Discriminative language models can seamlessly incorporate backoff features, F(A,B,C), F(B,C), etc. and learn appropriate weights. The key insight in this paper is that when a discriminative model uses these backoff estimates, it incurs no penalty. It's essentially overestimating the probability of uncommon n-grams in the context of common (n-k)-grams. This paper seeks to fix this, and does.
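For readers who haven't seen backoff in action, here's a rough sketch of the basic move: score C with the trigram estimate if A, B, C was seen, otherwise fall back to the bigram estimate times a fixed penalty. This is a "stupid backoff" style score, not a normalized probability and not the method from the paper; the function names and the 0.4 penalty are my own illustrative choices.

```python
from collections import Counter


def train_counts(tokens):
    """Collect unigram, bigram and trigram counts from a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return unigrams, bigrams, trigrams


def backoff_score(a, b, c, counts, penalty=0.4):
    """Score word c after context (a, b), backing off with a fixed penalty."""
    unigrams, bigrams, trigrams = counts
    if trigrams[(a, b, c)] > 0:
        # Seen trigram: relative frequency of c after (a, b).
        return trigrams[(a, b, c)] / bigrams[(a, b)]
    if bigrams[(b, c)] > 0:
        # Unseen trigram: use the bigram estimate, discounted by the penalty.
        return penalty * bigrams[(b, c)] / unigrams[b]
    return 0.0  # a fuller model would back off further, to unigrams


counts = train_counts("the cat sat on the mat the cat ran".split())
seen = backoff_score("the", "cat", "sat", counts)
backed_off = backoff_score("on", "cat", "sat", counts)
```

Without the penalty, the backed-off estimate would be treated as if it were just as trustworthy as a directly observed trigram, which is the overestimation the paper is worried about in the discriminative setting.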

Additional favorites, some from my students:

Word Embeddings for Speech Recognition by Samy Bengio and Georg Heigold. Far and away the most popular poster at the conference. This is high on my list to read closely. It promises to learn a Euclidean space into which word decoding can happen, so that words that sound similar are closer in space.

The Obligatory Contour Principle in African and European Varieties of French by Mathieu Avanzi, Guri Bordal and Gélase Nimbona. An investigation of prosodic differences in dialects of French. Very consistent with one of my students' dissertation projects.

Improving Spoken Document Retrieval by Unsupervised Language Model Adaptation Using Utterance-Based Web Search by Robert Herms, Marc Ritter, Thomas Wilhelm-Stein, Maximilian Eibl. A clever way of handling OOVs in spoken document retrieval. Essentially the idea is if I recognize W[i-2], W[i-1], UNK, W[i+1], W[i+2], go do a web search for the context, find some matching documents, and augment the language model with them, then re-decode. Kind of like distant supervision for spoken document IR.

Learning Small-Size DNN with Output-Distribution-Based Criteria by Jinyu Li, Rui Zhao, Jui-Ting Huang, Yifan Gong. How do you effectively train a small DNN without ruining its performance? This paper out of MSR suggests training a large DNN, then using its output distribution to train the small one. It'll take me some closer reading to fully understand why this works, but I'm intrigued.
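A rough sketch of what "using its output distribution to train the small one" can look like: instead of scoring the small network against a one-hot label, you score it against the large network's full posterior. The names below are hypothetical and the cross-entropy form is my assumption of the general idea; the paper's exact criterion may differ.

```python
import math


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(teacher_logits, student_logits):
    """Cross-entropy of the student's distribution against the teacher's."""
    p = softmax(teacher_logits)  # large network's output distribution
    q = softmax(student_logits)  # small network's output distribution
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


teacher = [2.0, 1.0, 0.0]
matched = soft_target_loss(teacher, [2.0, 1.0, 0.0])
mismatched = soft_target_loss(teacher, [0.0, 1.0, 2.0])
```

The loss is smallest when the small network reproduces the large network's distribution, so the small network gets a training signal from every output class rather than only the correct one.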

Canonical Correlation Analysis and Local Fisher Discriminant Analysis based Multi-View Acoustic Feature Reduction for Physical Load Prediction by Heysem Kaya, Tuğçe Özkaptan, Albert Ali Salah, Sadık Fikret Gürgen. We tried a number of dimensionality reduction approaches for this year's Paralinguistic Challenge, but didn't get particularly good results. These guys did using CCA and LFDA. Looking forward to reading this one as a more general feature reduction approach when the number of features is larger than the number of training instances.

Tuesday, August 12, 2014

Things I didn't know before becoming a professor: #5 How to manage my time

Grad school is an exercise in working independently.

What I mean by working independently is this: Your daily activities are barely supervised. Your long term activities are heavily scrutinized and harshly judged.

You must master this skill in order to finish a PhD. You must have a PhD to be a professor. So.... You'll be all set. Right?

This covers a lot of careers, not just academics, but being a professor provides an extremely lightly constrained schedule. On one hand, in this, grad school prepared me for the challenge of controlling my own time. On the other, it gave me a false sense of confidence. I was driving so well in the parking lot, and then I pulled into midtown traffic at rush hour. And I'm driving a go-kart.

In grad school you learn how to motivate yourself. You can read another paper, or you can watch another episode of The Sopranos. You can write up an experiment or you can play one more hour of Halo. Or you can just go to the beach. Once you're done with classes, there're no midterms coming up in a month. No papers due in two weeks. Just an uncomfortable meeting with your advisor, where you can make small talk and rehash previous discussions and results until the meeting is over and you think you've played it off. (You haven't, btw.) Most people figure this out. There is a wealth of advice about how to get off your rear and get work done. There are theories and apps and support groups and better blogs than this devoted to the subject. So I'm going to assume that even the most procrastinative of professors (my wife lovingly refers to me as an "epic time waster") have a set of tools at their disposal to handle the inevitable distractions and failings of motivation.

But here's where it goes haywire. The breadth and depth of things that need to be handled on a given day is far greater than what was expected in grad school. This isn't revelatory. It's to be expected; it's much more responsibility. The diversity of responsibility is as surprising as anything else. Being a professor requires you to teach, write grants, mentor grad students, handle bureaucracies (internal and external to the university), manage budgets, foster relationships with a broader scientific community, attend conferences, project meetings and site visits, and do some research. It's not too much; it's very doable. But it's much more faceted than grad student responsibilities ready you for. As a grad student, your responsibility is typically to do your research, and occasionally teach or take classes. There is relatively little necessary by way of prioritizing which of many projects and tasks need your attention.

I see the challenge in transitioning from student to professor like this: The motivational chops you honed during grad school have made your job as entertaining and rewarding as TV, video games and the beach. Now, you need to repurpose them so that the frustrating parts of the job are as entertaining and rewarding as the fun parts.

Every day I make two lists. One: things I want to do. Two: things I don't want to do. Things I want to do usually include reading papers, and writing code, some errands, grant ideas, sometimes course planning. Things I don't want to do are typically bureaucratic. Plus I've got a rule that if I've put the same task on the "want" list for three days running, my attitude has outed itself and it's actually a "don't want" kind of thing. Through the day, I go one for me, one for them. One from column a and one from column b. And every day there are a couple of things that I don't want to do, that I've found some way to avoid doing.

I've tried a lot of approaches to bring order to the responsibilities that being a professor entails. The most basic lessons I've learned are these.

1. Write everything down. I've got a decent memory, but there's no way that I can keep my schedule and the status of each student's project and other milestones and what I was thinking about yesterday at hand without a lot of notes.

2. Let go. Days are not long enough. When I go to bed, there is always something left undone. Make sure that it's not something truly important.

3. Do a little bit of soul-sucking work first thing in the morning every day. This keeps it from building up and leading to dreaded days that are totally eaten up by it. Spending a full day on bureaucratic maintenance to dig out is something that (I sincerely hope) no grad student ever has to grapple with. I treat this like the nuclear option; it's a clear sign that I've made a bunch of time management mistakes in the previous week/month.

4. Keep on top of email. Even when traveling or facing down a deadline. (See above.)

One thing I'm still trying to get my arms around is managing my time for priorities of different durations. Handling tasks for the day or week is pretty easy. Pegging against milestones like paper and grant deadlines, presentations at conferences or grant meetings, is pretty easy. It's harder for me to keep a longer perspective and give enough weight to the task that's not needed in 2 weeks, but one that might have impact in 2 months or 2 years. This medium- to long-view is harder to maintain and harder to integrate into daily priorities.

Get a little better every day.

Monday, July 07, 2014

Things I didn't know before becoming a professor: #4 Different types of collaboration

There's a romantic and resilient fantasy of the scientist toiling away in isolation. He (sad to say, it's almost always he) emerges from the lab, bleary-eyed and triumphant and shares his genius with the world.

Da Vinci, Frankenstein, Edison, Einstein, Every Evil Genius, Zuckerberg

Our science narrative has expanded to pairs of geeks in garages

Hewlett & Packard, Gates & Allen, Jobs & Wozniak, Brin & Page

but hasn't yet embraced the fundamental reality of modern science: most work is done in collaboration. Most great big ideas are the product of many great small ideas.

This is why there's no satisfying answer to questions like "who invented the atomic bomb". Was it Fermi? Einstein? Oppenheimer? Of course Al Gore didn't invent The Internet, but who did?

As a graduate student, you are, at worst, Igor, the hunchbacked assistant doing all the work and getting none of the credit. At best, you are a near-equal collaborator with your advisor. But there's a sea of collaboration waiting once you're out of the nest and on to the next step.

Since becoming a professor I've noticed a handful of types of collaboration. From intimate to remote, they each have a role to play.

Peer Collaboration
This can be the most satisfying and most productive way to work with a partner or team. You sit in the same room, or with a chat window open. You talk through ideas, big and small. You tackle the parts of the work you're most suited to, or most excited by. You debug code together. You run experiments together. Try to make sense of your shared results. You iterate over writing papers together.

It can also be the most frustrating. A group only walks as fast as its slowest member. If you don't work well with your partners, for any reason, the stress is visible almost immediately. If the partnership is strong, you figure it out, and get back to work. If it's not, you'll know pretty quickly.

Mentor-Student Collaboration
There's a similar intimacy in this collaboration, but there are some very clear differences. First, the pace of collaboration is typically slower. At slowest, you may work with your student only once a week. Even if you're in regular email contact it's only a couple of times a day. Typically, only during the most intense periods of an evaluation or paper deadline does the collaboration reach the level of immediacy and proximity of peer collaboration.

The division of labor here is more driven by the hierarchical nature of the relationship. The mentor typically has a broader vision of the work, while the student has a better grasp of the details. (Or if they don't, it's part of the responsibility and training to learn about the details.) The mentor has the responsibility of contextualizing the work, either into a broader research agenda that will include the student's thesis as an offshoot, or into a broader project or publication. At the start of the relationship, the mentor is typically responsible for guiding the research, determining which research questions are most fruitful and

Over the course of graduate study, a good mentor-student collaboration will take on more characteristics of a peer collaboration, with the student ultimately initiating most of the research ideas. But there's always going to be a level of deference that the mentor receives.

-----

As a graduate student, I had a lot of experience with the previous two types of collaboration. I even had some exposure to the next three, but not so much as to be able to identify the differences, or how they impact the work.

-----

Project Collaboration

On medium-to-large multifaceted projects there are frequently multiple PIs (principal investigators) each with independent research groups. Collaboration across sites towards a common goal is different and challenging. The PIs coordinate with some frequency. There are bureaucratic reasons for this -- mostly around budgets and report writing -- but coordination between students is much less common. Each of the individual research groups has its own broad responsibilities, timeframes and agendas. Each is balancing a number of projects, student milestones (qualifying exams, etc.), and specific personnel decisions.

The biggest difference in this kind of collaboration, in my experience, is the pace. The interaction between groups when it comes to research is infrequent. Sometimes just a few times a month. Another difference is in the level of detail. When the groups share their work, it is usually at a relatively high level -- far removed from unsuccessful approaches and interesting failures. There are opportunities to get advice at a high level about directions and plans, but close collaboration about specific decisions and approaches is much less common.

This is completely understandable. The nature of this collaboration demands it. These projects are big enough to demand a large number of people working on them. This task oriented partitioning into research groups is probably the best way to manage the effort.

However, there's the time when the rubber meets the road. This is usually in the form of deliverables, where you have to send a report or code to a funder, evaluations, where all of the moving parts that the groups have developed have to work together, or presentations at site visits or PI meetings, where you have to show that you're all working towards a common goal.

Often, these moments demand closer coordination and collaboration. Having made a few mistakes, the lesson I've learned is to stop and drill down. Take meetings offline to deal with discrete questions and issues. Get the students involved in these meetings. If at all possible, visit your partners. With your students. One hour face to face can take the place of a few days' worth of emails.

Social Collaboration
Still another step more removed is the kind of collaboration you have with the scientific community which you're a part of. This is the sort of collaboration where you have a conversation with someone about an idea that one of you is chewing on. You bat the idea around for a little while, maybe hear about a technique you didn't know about or hadn't thought of, and then the conversation moves on. I've found that this kind of collaboration occurs with someone you know. Either you know each other's work, or you have a mutual colleague to bring you together.

Most of the time this type of one-off micro-collaboration ends with a single conversation. Sometimes it results in shared data or tools. But given the right alignment of circumstances it can foster a more serious coming together. Usually just an extended conversation over email, or a phone/skype conversation, but if fate is really on your side, this can grow into a paper or a grant proposal. But by this point, you've switched categories and you're working much more closely.

I find that this collaboration is most available during visits to other institutions and at conferences. Frankly, this kind of collaboration is often the most valuable thing I take away from conferences.

"Public" Collaboration
If you're doing it right, people are going to know about your work. People will email you with questions and comments. You will get queries about how to use a tool you wrote, or (maybe more frequently) a bug report or feature request. Through this communication you are supporting other people's research. This rarely turns into a more substantial collaboration. But it means that people will cite your work.

Here, the collaboration is somewhat lopsided. Essentially your prior effort of writing a paper or tool is supporting someone's current research. But this needs some additional support from your current self, either in clarification or amendment. The upside is that if the system is working, you can garner this support as much as you give it out.

Here, I find the challenge is to appropriately prioritize these interactions. More frequently than I'd like to admit, these are the emails that filter to the bottom of my inbox.

If you're reading this, and I still haven't gotten back to you, I'm sorry. I know these interactions are important. I'm working on it.

Wednesday, May 21, 2014

Things I didn't know before becoming a professor: #3 How to run a research group

At this point, I've figured out my classes and secured a little bit of funding. Now it's time to staff up the group and get to the work of research.

Every researcher I know keeps a list of open questions and tasks that they want to get to next. This list is a best friend, and a worst enemy. This list is a hydra. Every time an element is taken off, two more appear in its place. To tame the beast, it takes a group effort.

It's no revelation to recognize that management is different from labor. I'm sure if I had an MBA instead of (in addition to?!) a PhD, some of these issues would be more intuitive. But I didn't, and most professors don't. We're self-taught.

When the research group is firing on all cylinders, it feels like my work is having a multiplicative effect, that I'm lifting more than I possibly could alone. When it's not, it's demoralizing. I try not to get sucked into this, but there are days when it feels like I could do more in a cave with a laptop, an internet connection and coffee.

These are some of the things about running a group that I didn't know before becoming a professor.

How to organize a research agenda

At my thesis proposal, Kathy McKeown asked me the hardest question: "What do you want to be famous for?" It totally blindsided me. I laughed uncomfortably, and looked to my other committee members for acknowledgement that this was an outrageous proposition. No help was coming. I have no idea what my answer was, but it's the only question I remember from my candidacy exam, proposal or defense.

A clear answer for this question, an organizing principle for the research that your lab will do, brings clarity and identity to the group. It represents a research agenda that guides the work. A clear research agenda serves a similar role as a mission statement and business plan. It communicates your goals internally and externally.

This also helps with recruitment -- if students and postdocs know exactly what kinds of questions you are interested in, you will attract people who are interested in these questions, and people who are capable of working on them. The same principle applies to students who are looking to do independent studies, undergraduate or master's theses with you.

I've generally under-appreciated the value of this focus. Personally, I've benefitted from broad research interests. I get a lot of satisfaction by working on a lot of different things. My most cited paper wasn't even cited in my dissertation. But I think I've also suffered from this generalist approach. I may have had greater impact on the questions I find most interesting had I limited my activities on the second tier. I may have made more progress in getting "famous" for something.

How to engender community and collaboration

Being a graduate student can be isolating. There are some classes. A weekly meeting or two that you need to show up for. But most of your work is alone.

How do you take a group of students who are organized only by you, the professor, and help them communicate, support each other, and become colleagues? I certainly had no idea.

Some things I've tried:

Taking everyone to lunch. This didn't work so well. It probably would work now, but at the time the students didn't know each other well enough and it was just pretty awkward.

Shared tasks. Inviting a group of students to work together on a short term project is the single best thing I've done to get students to trust each other and exercise their basic research skills. I use the Interspeech paralinguistics challenges for these. I also invite other computational linguistics students to participate from the linguistics and CS programs. It's a really fun exercise and the students get a lot out of it.

Weekly lab meetings. One student gives a talk every week. It gives a safe space for students to practice giving presentations (and giving feedback on presentations), and it makes sure that everyone knows what everyone else is working on.

How to make and manage a budget

This is my least favorite part of the job. Making a budget is fine. It's just making puzzle pieces fit. But inevitably there is some wrinkle to sticking to it. Travel costs are more or less than budgeted. At CUNY tuition is highly variable based on what year a student is and how many courses they are taking.

And then there is interacting with admins, grants, financial officers and contract lawyers. These people are invaluable. When the relationship is good, your life is easy. When there are delays it can make sticking to the timing of a budget very very very difficult.

Having an eye for detail and the patience to use it really comes into play here.

Not everyone knows what I know

The next three observations are less about specific skills, and more about the personal effects of leading a group.

As a new professor with new graduate students, you know more than any other research group member. This isn't a "smartest person in the room" thing (though sometimes it is), but an experience thing. The professor has read more papers, seen more talks, (may have executed more experiments) and knows more about the topic.

The list of things that you want to work on next is based on your previous work, your thinking and your instincts.

In order for a student to pick up and work on a task, they need to get up to speed, and fast.

It is too easy to forget that the people working with me don't know what I know.

How to delegate

It became clear to me early on that not all CS professors still program. This struck me as a scary idea. First of all, I like to program, and I'm reasonably good at it. Secondly, it seems like distance between a professor and the nuts and bolts of research is a dangerous thing.

I was talking with Dan Ellis shortly after I had realized this. (Never mind that Dan is an EE professor.) I asked him if he still coded himself. He had just finished a project over the summer and was happy with it and the process, but then he described it as "indulgent". Why "indulgent"? Because, according to Dan, he should be working on writing grants, papers and reports, executing the broad vision of the lab, while the students benefit from the programming experience. (Dan, if you read this, please forgive mistakes by paraphrase, and the fog of memory.)

The responsibility to delegate work to students is one that I haven't quite internalized. I still run a lot of experiments myself. I write a good deal of code still. And when we're under close deadlines, I have a tendency to tell students to walk away, and handle the final push myself. I'm not yet prepared to call this "indulgent", but I'm ready to acknowledge it as a bad habit.

How to be a mentor

Internalizing the role of "mentor" is much more difficult than "teacher". Teaching is something that you do a few times a week, but it's a hat you take off. Mentoring, or advising graduate students, is an ongoing process.

I'm the most visible example of what a professor is for my students. How I do this job impacts how they will. (Just as my advisor is my first and best example that I draw from when advising students.)

My behavior as a mentor impacts how they'll behave as students. If I respond to emails at 1am, I will get emails at 1am. If I don't communicate what I expect from them, they won't know.

In addition to my advisor, I've had a number of important mentors. There are two lessons I've taken from those relationships.

Protect your advisees. There are plenty of external challenges -- defenses, negative paper reviews, job postings, evaluations -- that a student will participate in. They do most of the work. I feel like it's my job to make sure they're set up for success. (Only submit papers that you would accept. Don't volunteer them for too many responsibilities.) When things blow up, absorb the blow. Outwardly, take responsibility for mistakes, figure it out with the student internally. If the students feel safe, they'll be enabled to do good work.

Be a happy warrior. There are parts of the job that are difficult. It takes long hours and drive. And you're in this job for a reason. You want to discover, and share those discoveries. You want to contribute to Knowledge. As a student, I fed on the enthusiasm of my mentors. While it's easy and appropriate to feel a kinship with students, I feel a responsibility to set a tone that this is valuable, but also this is fun. And it is.

Monday, May 05, 2014

Things I didn't know before becoming a professor: #2 How to Write a Grant

To get tenure in computer science, you have to secure external funding.

I don't know how important this is in other disciplines. I'm sure there are administrations that will say that external funding is only one of a number of criteria that determine tenure decisions. I'm sure they're right. But without funding, you're at a severe disadvantage. You can't work with grad students (as easily -- some programs provide student funding through other means). Travel to conferences -- where you present your work and solidify relationships with important people in the field (who may end up writing letters in support of your promotion) -- is expensive. And then there's the brass tacks. Depending on institution and some uninteresting details, the administration receives somewhere in the ballpark of 1/3 to 1/2 of the funds that you secure.

Before becoming a professor, I hadn't written a grant proposal. I had read a couple of successful proposals that I had worked under as a student, but I hadn't been a part of the preparation of any.

A little bit of context before I share what little I've learned about writing grants.

I've been pretty successful in securing funding over the last five years.

I don't expect this trend to continue forever.

I recently received an NSF CAREER award.

But my previous two proposals were not funded.

I'm hardly an expert, but I've gotten better since starting.

Here are a few things I didn't know:

How many funding opportunities there are
I knew about NSF and NIH. I knew about DARPA because I worked on a grant as a student. I didn't know how many programs NSF has (there's no way I know about all of them). I didn't know about the other DoD agencies that fund basic research. I still don't know about all of the foundations and industrial grants and awards that are available. I'm sure I'm still missing some, but IBM, Google, and Microsoft each offer programs to support faculty. On top of that, there are institutional awards -- CUNY has a handful of them, and if we do, I'm sure everyone else does. This doesn't mean that it's easy to find funding. But by keeping an eye open, asking around, and being lucky enough to be asked to join grant writing efforts, I've been surprised by how many opportunities are available (in this field).

How to tell the difference between a good proposal and a not-good-enough proposal
Friends and colleagues will offer to share successful grants with you. Take them up on it. But learning from only positive examples is challenging. It can be challenging to figure out how to translate structure and presentation cues from one grant to another. Also, it's not always clear what made these examples impress the reviewers. If you're so bold, ask these people to share the reviews along with the grant. I haven't tried this, and I'd have to trust a person quite a bit to share my highlighted flaws with them, but give it a shot.

Still better is getting to look at unsuccessful proposals and their associated reviews. Everyone you know has one or two of these kicking around. Funding rates are pretty low, say 20% or so, so most people will have a stack of failures for each success they have. But again, it takes moxie to ask someone for this, and a lot of trust to offer it. (I don't know if this ever happens, but I'd love to hear if other junior faculty were able to read unfunded proposals as they started writing their own.)

The best way to see a lot of examples of grants, both positive and negative, and their reviews is to get on a grant reviewing panel. NSF Program Managers like to include junior faculty in their panels. Call one who funds work like yours, and have a conversation. I was invited to one in Spring 2010. It was an opportunity to closely read about 5-6 proposals, and more casually review another 12 or so. Seeing all the ways a basically good idea could not get funded was eye-opening. Generally, the quality is high, so the differentiation between successful and unsuccessful proposals can be the depth of their weaknesses rather than the height of their strengths.

That writing a grant is writing science fiction
It's believable, near-future science fiction, but still in the genre. You're describing something that doesn't exist, but will in the next few years.

This perspective has made the process of grant writing a lot more entertaining.

Science writing has a narrative arc. The introduction and motivation of your work should be exciting. The feeling I want to leave a reader with is somewhere between, "Of course that's what will happen", and "Wouldn't it be cool if that's what the world was like." Balancing these two is a matter of steering between the Scylla and Charybdis of incrementality ("boring") and overreaching ("I don't believe you can do this").

How to keep it together after getting a grant rejected
It's only natural to be offended, embarrassed, insulted, frustrated, and sad when a proposal that you spent months of your life on is not funded. The panel was clearly full of idiots who didn't understand your genius. There was some mistake, if only they had read more closely they would understand how important this work is.

Here's what rejection has taught me:

Take a deep breath.

Most proposals aren't funded.

The folks who review your grants are almost universally qualified to do so.

If a proposal is funded, you are not a genius. If a proposal is not funded, you are not an idiot.

The sooner you get over it, the better. Remember, it's the work that's being reviewed, those crammed 15 pages of science fiction, not your identity.

And there's a silver lining: almost always they will include a stack of written reviews. These reviews are a gold mine. Treat them as a genuinely helpful consolation prize. If you don't understand what they're saying, you can probably talk to a program manager about them.

When revising a proposal, I take all of the reviews, and cut out everything positive that they had to say. It's too easy to be comforted by that. What I really need to know is what didn't work. That's what needs work.

How to find writing advice
I still don't have a lot of confidence in my grant writing. But there are a lot of people who do, who will share their advice with you. Almost all institutions have grant writing workshops. NSF hosts workshops. There are posts like this, and this, and this, written by people who are truly qualified to steer you right about the process. Writing with a group of people helps. Collaborative proposals come with their own logistical challenges, but they can be helpful for getting feedback on the writing.

Some graduate students write applications for fellowships, internships and other sources of funding. This is great practice, but this is different from the grant process. Maybe the best way for students to get exposure and experience at grant writing is through their advisor's proposals. I think some professors work with their students on their proposals. If the student is a strong enough writer, and far along in their dissertation, I could see this making a lot of sense for both the professor and student.

Sunday, April 27, 2014

Things I didn't know before becoming a professor: #1 How to Teach

In order to be a professor* you need a PhD. The two responsibilities you have as a professor are to do good research and to teach well.

To get a PhD, you need to do good research, so you're well equipped to handle this.

To get my PhD, I did not have to teach. I had TA'd a few times. But I had put in 10 years in college/grad school. I had taken countless classes, some with great teachers, some less great. So I figured no big deal; I can teach.

I've seen good movies. I've got a good story. I can make a good movie.

These are some things I didn't know about teaching before becoming a professor.

You are television.
I learned this from Michael Cirino in a very different context.

For the time that you are in front of students, you are television. You are putting on a show. If your students don't engage with you, your students won't learn anything beyond what they get in a book. That's not to say that you're entertainment, but you are performing.

My act isn't strong yet, but it's getting better.

Don't skimp on the basics.

In an early version of a Machine Learning class, I decided that I wanted a lecture on spectral clustering. It was a topic I was interested in, but didn't have a ton of experience with. I read a lot. I worked out math. By the time I had a lecture ready, I was feeling good about it. I went to class, delivered my A material....blank stares. It was a total dud.

It wasn't that the lecture was bad, but I hadn't earned it. Because I wanted to get to a topic that was exciting to me, I had rushed through or omitted a lot of background material. I had completely set myself up for failure. When I went back to figure out where I had gone wrong, I realized it was months earlier, when I was planning the syllabus.

Every time I teach a new class, I tell myself that I'm going to have half a dozen or so lectures ready to go by the first day of class. Sometimes I hit this number, usually I don't. But I've come to realize that lectures are a lot easier if you have a clear structure ready for the class. Lately, I've become less concerned about having complete lectures prepared. Instead of writing a few complete lectures, I try to have a fairly detailed outline of every class. This helps me focus on the big picture.

When I have a clear direction for how the pieces fit together, the course is better. Even if that means I don't get to the most exciting material, or have an excuse to learn something new.

Preparing a lecture takes an unbelievable amount of time.
Truly unbelievable. In my first semester, it took 8-10 hours to prepare each 75 minute lecture. Add in writing assignments and exams, grading, office hours, and 2.5 hours in front of students. Thankfully, I was only teaching one class. But at CUNY a full load is 3 classes in a semester.

Most of your time is spent with students who are struggling.
"Do you learn from your students?" No. I teach them. "Are you inspired by your students?"

My PhD students are great. They bring some exciting ideas and papers, and there's a collaborative learning that happens there. They inspire me and drive me. Absolutely.

Students in class, I have a very limited and lopsided relationship with. I don't think I've ever had an office hours meeting with a student that took material from class and took it a step further. It's almost always something to the effect of going over material from the previous lecture or exercise in more detail. There's immense satisfaction from guiding someone to understanding challenging material, but it takes a lot of time.

Students cheat. A lot.
I have had a student cheat in almost every course I've taught. (This isn't unique to CUNY. Ask around.)

The cheating meeting is the most emotional human experience I've had with anyone other than a family member or romantic partner.

The student usually cries.

The student usually tries to negotiate out of the repercussions.

Denial? Not so much. Most people own up to it pretty quickly. Forceful denial is a red flag for me. Be open to the possibility that you made a mistake. Maybe one party knew about the cheating and the other didn't. Maybe the similarities between two assignments really were random.

I've had over a dozen of these conversations. Here's my best advice: Be prepared. Be able to clearly explain how you know cheating happened. Be able to point to your syllabus and university student handbook about the penalties for cheating. Absolutely document everything you can about the exchange. Send the student an email after the fact, recapping the major points. Expect that the student will scramble for an out -- some way to lessen the impact. Don't let them. At this point, I have a loose script. It's almost as formulaic as a five-paragraph essay.

A. It's clear to me that you cheated on this assignment/exam.

B. Here's how I know. (It's about here where they usually admit to it.)

C. Because of this you will be getting a zero on the assignment/failing the class/getting expelled, and I will be sending a letter with this information to the department. (Or whatever your policy is.)

Put it in writing. Get it in writing.
Almost all I knew about teaching I learned from a video (VHS!) I was shown while at Columbia. This was an impromptu sharing; it seemed like a tape that was passed around the CS department and shown by professors to their grad students. It was a lecture by John Kender called something like "How to Teach". It was fantastic. It's similar to this video on iTunes. Before you teach, watch it.

The most significant lesson I remember from this lecture was that a syllabus is a contract, an assignment is a contract, an exam is a contract. It is your responsibility to outline the terms of this contract as clearly as you can. This is what you will learn from this class. This is what to expect from this course, assignment, exam. If you do this, you will get this grade.

Most of the difficulty I have had with students and disputes can be traced back to not being rock solid in the language used in a syllabus, or on an assignment, or (and this was a surprise) not putting things that were discussed in a meeting, in writing.

If you make an arrangement outside your syllabus with a student around any element of your course, shoot them an email after the meeting recapping what was discussed. I didn't know this before teaching, but wish I had.

What now?
I've definitely become a better teacher over the last five years. I've made a lot of mistakes and I still do.

Each time I repeat a course, I toy with its structure. My boilerplate syllabus has gotten tighter.

The broader point is when I started, I was starting cold.

There must be ways, programs, seminars, etc. that aim to teach people how to teach. I was barely exposed to any as a graduate student; I know many of my peers weren't either. Very little was available before I was in front of students for the first time.

It's easy to gripe about a system that left me unprepared, but now I'm on the other side of the equation. I have graduate students, some of whom will go on to be professors. What can I (and my institution) do to make sure they have the skills to be good teachers when they land tenure-track jobs (which of course they all will)?

Practice. Practice. Practice.
For me, becoming a better teacher has taken practice. I think that's the only way to learn how you are going to teach. And there aren't enough opportunities to practice this. (I've taught Algorithms 4 times, and Machine Learning 3. That may sound like a lot, but it's only 2 or 3 opportunities to revise structure, lectures and graded material.)

At CUNY, graduate students teach a lot**, so many of my students will have experience in front of students. There are two downsides to this: 1) teaching takes a lot of time, which means that they're not focusing on their research, and 2) often they're not given the responsibility/opportunity to design the class themselves. Instead they teach a section of a larger course with a fixed set of assignments and exams. This leaves students with plenty of experience lecturing and leading discussions, but less experience with the mechanics of running a course (which is where your teaching lives or dies).

I think one solution might be to have graduate students prepare and teach mini-courses, complete with syllabus, and graded assignments. These should be short, maybe 6 or fewer meetings over a month or so. This keeps the workload more manageable compared to teaching a full course. But it would allow students to practice structuring material, and writing homeworks and exams. They shouldn't be offered during regular course periods, but in summer or between terms.

I think the best approach would be for this mini-course to be on the student's dissertation topic. First of all, they'll already know a ton about it. Second, if they go on to a tenure-track job, chances are they'll have an opportunity to reuse some of these lectures, either in a(nother) course of their own or at a conference tutorial. Third, lecturing on a topic, and fielding questions can bring to light all the things you don't know or are unsure of. But the practice would be useful even if it was on some other topic.

The biggest problem I see with this idea is getting the incentives right. To teach something like this takes a lot of work, and there's little reward. Moreover, there's little incentive for other students to take one of these mini-courses (and to do the homeworks/assignments). MIT has a thriving IAP program with a ton of activities and minicourses ranging from the technical (some for credit) to the slightly absurd to one of my favorite things. The IAP is well established in the MIT culture. Can something similar be started up elsewhere?

There's no way to do this through the university registrar without a lot of bureaucracy. However, if a department, or division, were to unofficially "bless" this kind of activity by 1) publishing a list of course offerings, 2) documenting who taught what when, 3) conferring completion "certificates", and 4) finding teaching space, the publicity of a program like this could encourage students to participate on both sides.

There are a lot of reasons that a program like this would fail to get off the ground, but if there were a mechanism for graduate students to get practice running courses in a relatively low-risk environment, I am confident that they would be better prepared for tenure-track positions.

I would have been.

* tenure-track
** maybe too much, but that's a different discussion

Spoken Language Processing