In 1948, the mathematician and engineer Claude Shannon published a paper showing mathematically that it should be possible to transmit information reliably at a high rate even over a noisy channel. And with that, the field of information theory was born.
Caltech's newest assistant professor of electrical engineering, Victoria Kostina, works in this field. She comes to Caltech's Division of Engineering and Applied Science from Princeton University, where she completed her PhD and a postdoctoral position in electrical engineering. Prior to that, she earned her master's in 2006 from the University of Ottawa and her undergraduate degree in applied mathematics and physics in 2004 from the Moscow Institute of Physics and Technology.
We sat down with Kostina to talk about communication systems and her work.
Can you tell us a little bit about your field?
Information theory is about the transmission of information. It studies, in a very abstract way, elements in all communication systems. And a communication system is something very general—it is any system with a transmitter, a receiver, a communication channel, and information that needs to be relayed. So, for example, as we are talking right now, we represent a communication system: I am trying to transmit my information to you, the receiver, through the air, which is our communication channel.
In his famous 1948 paper, Claude Shannon, the father of this field, asked, "What is achievable in a communication system when you have noise?" So imagine if we were standing in opposite corners of a large room filled with people talking amongst themselves, and I needed to talk to you, but for some reason I could not move. What could I do? I could try to yell louder to combat the noise. But yelling is very expensive because it strains your vocal cords—in other words, it can burn out the amplifiers in the system. Just increasing the power is not a good way to address the problem.
Another thing I could try to do is just repeat my message many times in a low voice, hoping that at some point the noise in the room would be low enough for my message to get to you. This is what is known as a repetition code. It is the simplest, most basic code you can think of—it just repeats the same message many, many times.
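The repetition code she describes can be sketched in a few lines. This is an illustrative toy, not code from her research: it sends one bit through a binary symmetric channel (each copy flipped with some probability) and recovers it by majority vote.

```python
import random

def transmit(bit, flip_prob):
    """Send one bit through a noisy channel that flips it with probability flip_prob."""
    return bit ^ (random.random() < flip_prob)

def repetition_encode(bit, n):
    """The repetition code: just say the same bit n times."""
    return [bit] * n

def majority_decode(received):
    """Recover the bit by majority vote over the received copies."""
    return 1 if sum(received) > len(received) / 2 else 0

# Example: repeat a 1 eleven times over a channel that flips 20% of bits.
sent = 1
received = [transmit(b, flip_prob=0.2) for b in repetition_encode(sent, 11)]
decoded = majority_decode(received)
```

The more repetitions, the more reliable the vote—but the data rate falls toward zero, which is exactly why smarter codes are needed.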
Of course, there are data transmission codes that are much more intelligent than a repetition code, and they are capable of transmitting more information much faster.
Did Shannon have an optimal solution?
What Shannon showed is that, in fact, you can transmit many bits of information per unit of time even if the noise is very, very high, using intelligent coding systems. This was a remarkable insight, and at the time, people didn't know how to achieve this. Shannon arrived at this insight through mathematical modeling; he didn't show how to build the codes.
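Shannon's result can be made concrete for the simplest noisy channel, the binary symmetric channel, whose capacity has the well-known closed form C = 1 − h(p), where h is the binary entropy function and p is the bit-flip probability. The sketch below (illustrative, not tied to any particular system) shows that the capacity stays strictly positive even when the channel is very noisy:

```python
import math

def binary_entropy(p):
    """Entropy, in bits, of a coin with bias p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Shannon capacity (bits per channel use) of a binary symmetric
    channel that flips each transmitted bit with probability p."""
    return 1.0 - binary_entropy(p)

# Even a channel that flips 40% of all bits supports a positive rate:
rate = bsc_capacity(0.4)  # about 0.029 bits per channel use
```

A flip probability of 0.5 is the one exception—then the output is independent of the input and the capacity is zero.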
Information theorists like myself try to understand, for any communication system, what data transmission rate is attainable. In other words, what is the optimal data rate that we can achieve in these systems?
And do you use mathematical modeling to try to determine that?
Absolutely. This research requires tools from probability theory, the theory of random processes, and related areas. What we try to do is to look at the problem and extract the most relevant features to make it as simple as possible. Then we describe the problem mathematically, solve it, and then try to bring it back to the real world and see how this mathematical insight might help design real-world communication systems.
What is the specific focus of your research?
My research has to do with bridging the gap between what Shannon's theory tells us and the real world.
Shannon's theory showed us the limit of what is achievable in the case where we have all the time in the world to transmit information. But in real-world systems, especially in modern real-time applications, we cannot afford to wait forever. Anyone who has talked on Skype knows this. A delay of even a couple of seconds gets annoying very quickly.
So I'm trying to understand the fundamental limits in systems in which the delays are strictly bounded. This would ultimately inform real-world designs. Of course, there are many challenges in achieving that goal; even after you know the fundamental limit, a lot of work must be done to design codes that can attain that limit.
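The tradeoff between delay and rate that she describes is often expressed in the finite-blocklength literature through the so-called normal approximation: at blocklength n (a proxy for delay) and tolerated error probability ε, the best achievable rate falls short of capacity by roughly √(V/n)·Q⁻¹(ε), where V is the channel dispersion. The sketch below applies this to a binary symmetric channel; the specific parameter values are illustrative assumptions, not figures from her papers.

```python
import math
from statistics import NormalDist

def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_normal_approx_rate(n, p, eps):
    """Normal approximation to the maximal coding rate (bits per channel use)
    for a binary symmetric channel with crossover probability p, at
    blocklength n and block error probability eps."""
    capacity = 1.0 - binary_entropy(p)
    # Channel dispersion of the BSC (known closed form).
    dispersion = p * (1 - p) * (math.log2((1 - p) / p)) ** 2
    q_inv = NormalDist().inv_cdf(1 - eps)  # the inverse Q-function at eps
    return capacity - math.sqrt(dispersion / n) * q_inv

# Tighter delay budgets (smaller n) pay a larger rate penalty below capacity:
short_rate = bsc_normal_approx_rate(n=200, p=0.11, eps=1e-3)
long_rate = bsc_normal_approx_rate(n=20000, p=0.11, eps=1e-3)
```

The gap to capacity shrinks like 1/√n, which quantifies exactly how much rate a real-time system gives up by refusing to wait.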
What are the applications of this work?
There are two points I would like to make about applications. The first is that it's important to know these limits so that the people who are working on the design of better codes know what to shoot for. Let's say they know the fundamental limit, and they're measuring the performance of an algorithm and know that they're already quite close to the limit. In that case, maybe it isn't worth spending more effort and investing in further development of that code. However, if they see there is still a big gap or room for improvement, maybe it is worth it to invest.
The second point is that as we go through our analysis of the system, we do gain insights into how to build real-world codes. We come away understanding some essential properties that a good code should have and that a coding engineer should aim for in order to attain those unsurpassable fundamental limits.
What do you find most exciting about your work?
I love that it is very basic research, very theoretical. Once we strip away all the particularities of a given problem, we are left with a mathematical model, which is timeless. Once you solve such a problem, it stays there.
But at the same time, I like that this work applies to the real world. The fact that it gives us insights into how to improve existing communication systems is a very exciting feature for me.
Do you remember how you originally got interested in these sorts of questions?
I grew up in a small town near Moscow, and both of my parents were engineers. So I think from a very early age they instilled in me this inquisitive outlook on the world. They always had very interesting, long answers to my questions like, "Why is the sky blue?" and "How do planes fly?" I think that's how I knew that I wanted to understand more about the world from a mathematical point of view.