In an effort to spotlight the engineering team at Cohesity, I’ve decided to host a series of Q&A interviews with various members of the Technical Staff. We’ll use the forum to highlight their diverse backgrounds, what they’re working on at Cohesity, and expose their immense value to the team.
Nick: Outside of work, what are some of your hobbies and passions, personally?
Abhijit: Music. I do some singing and play some instruments. There’s this instrument called [Indian] “Banjo”. I mostly play Indian pop music, Bollywood style of music. I used to play sports: cricket, tennis… but now the body doesn’t support that.
Nick: If I snuck a peek at your resumé, what sort of high points and milestones would I see?
Secondly, prior to working at Cohesity, what are you most proud of working on?
Abhijit: There have been a few, and I believe the highest point is still to come, but one of the high points was working on Google Search. I learned a ton of stuff there. Before that, I didn’t understand Machine Learning or any of the analytics. Those four years were worth their weight in gold. I was able to directly contribute and make some changes to Google Search around relevance and quality of content. You want to see the most relevant results from the highest quality source, and the funny part is those are two competing dimensions. For example, if there was an Ebola outbreak, people would always go to Wikipedia or some content farms, etc., whereas a World Health Organization (WHO) page might have the latest information on the actual outbreak and what is going on.
One of the objectives of Google Search is to bring more objective and more authoritative content, so I did a bunch of work around that. Coming up with “signals” that led people to the right kind of content that is both relevant and is of high quality.
Before that, I worked at Synopsys. That’s a completely different industry. Synopsys creates software that optimizes hardware. For example, if you’re at Intel and you want to build your next generation processor, we would write software that would optimize hardware for all sorts of performance metrics: Speed, Power, and so on.
Nick: So, Google Search and Synopsys. Very cool! What do you work on here at Cohesity? What is your core charter?
Abhijit: Well, there is no core charter, so to speak, at Cohesity. Actually, we encourage people to move around! I’ve moved around a few times. I’ve worked on Apollo, which is our MapReduce backend, which runs in the background and keeps the cluster healthy. Then I’ve also worked on Analytics some. You may remember my presentation and demo from Storage Field Day on this. I’ve also worked on encryption. Right now, I’m working on Magneto, which is our backup engine, specifically around our MSSQL backup capability.
So, a lot of people [in engineering] do that, where they move around to work on different areas.
Nick: Do you have a favorite so far of the things you’ve worked on?
Abhijit: I think the place where I might have the most impact is Analytics, because of my background, but we are at a stage where… well, I think the analytics piece is still down the road a bit. You have to get data on the cluster. The other thing is that backup admins today, they are probably not used to running analytics. It’s going to be a sort of transition for them. But I firmly believe that once they’re fully exposed to it, they won’t be able to live without it in the future.
Nick: I agree with you. I think there’s going to be a transition where it flips and becomes the ‘norm’, and once we set that bar, end users will expect it as a requirement.
What excites you about the Cohesity platform itself? Why is it different than whatever else already exists in the industry today?
Abhijit: The reason why I joined Cohesity is that it is taking a holistic approach to the entire space, and it’s not just trying to solve one specific problem. The charter is pretty broad. The vision is actually VERY clear, even though there is some mild confusion in the industry. Our founder knows what he is doing; he has done it before. In addition, I believe we have assembled a team that is capable of delivering on the vision and the promise, even though it is very broad, and once that train starts rolling, I think we have enough ammo to continue accelerating for a very, very long time.
Nick: From a pure engineering perspective, outside of the platform itself, what technology have you guys created that is game-changing within the Cohesity platform?
Abhijit: I think the way we have built OASIS (our storage OS) and the versatility of its ability to do all kinds of things. Its ability to do very frequent snapshots AND have them act as fully-hydrated. I think most people can do one of the two, either snapshot frequently or keep them hydrated, but OASIS lets you do both, and because of that it exposes a whole bunch of possibilities.
I think the other important component is our distributed key-value store, which truly scales. That’s one of the pieces that’s driving all of the scale-out architecture. Internally, we call it Scribe.
Nick: What differentiates Scribe from anything that already exists in the industry?
Abhijit: Most of the key-value stores are Eventually Consistent. Meaning, if the value of ‘A’ is 10 and one of the nodes acknowledges that write, then all the other nodes of the system could potentially return a stale value for a while, eventually becoming consistent.
Here, we are Strictly Consistent. If the system has said, “OK I acknowledge that ‘A’ is now 20,” then all nodes will return the new value of ‘A’ as 20 upon any read request.
Making a high-performance and strictly consistent NoSQL key-value store is not easy.
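[Editor’s note: The difference Abhijit describes can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not how Scribe works: the class names, the in-memory dictionaries standing in for nodes, and the explicit `sync()` step standing in for background replication are all hypothetical and exist purely for illustration.]

```python
class EventuallyConsistentStore:
    """Toy model: a write is acknowledged once a single replica has applied it."""

    def __init__(self, num_replicas):
        self.replicas = [{} for _ in range(num_replicas)]
        self.pending = []  # acknowledged writes not yet copied to every replica

    def write(self, key, value, node=0):
        self.replicas[node][key] = value   # only one replica sees the write
        self.pending.append((key, value))
        return "ack"                       # acknowledged before full propagation

    def read(self, key, node):
        return self.replicas[node].get(key)

    def sync(self):
        """Stand-in for background replication (hypothetical helper)."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()


class StrictlyConsistentStore:
    """Toy model: a write is acknowledged only after every replica has applied it."""

    def __init__(self, num_replicas):
        self.replicas = [{} for _ in range(num_replicas)]

    def write(self, key, value):
        for replica in self.replicas:      # apply everywhere first
            replica[key] = value
        return "ack"                       # then acknowledge

    def read(self, key, node):
        return self.replicas[node].get(key)


# 'A' starts at 10 on every node, then is updated to 20.
eventual = EventuallyConsistentStore(num_replicas=3)
eventual.write("A", 10)
eventual.sync()
eventual.write("A", 20)                    # acknowledged by node 0 only
stale_read = eventual.read("A", node=2)    # node 2 still holds the old value
eventual.sync()
fresh_read = eventual.read("A", node=2)    # consistent after propagation

strict = StrictlyConsistentStore(num_replicas=3)
strict.write("A", 10)
strict.write("A", 20)
strict_reads = [strict.read("A", node=n) for n in range(3)]

print(stale_read, fresh_read, strict_reads)
```

[In the eventually consistent model, a read from another node right after the acknowledgment can still see the old value; in the strict model, every node returns the new value as soon as the write is acknowledged. A real distributed store also has to handle failures, concurrency, and performance, which this sketch ignores and which is where the difficulty Abhijit mentions comes from.]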
Nick: Outside of Cohesity, what parts of our ever-changing industry excite you the most?
Abhijit: I think the thing that excites me the most is the fact that analytics and machine learning are increasingly being used to make informed decisions day-to-day, at various levels. We’re talking about Secondary Storage here, where the role of analytics is discovery … ya know, what’s out there within my data. Even on the primary side or the business side, people are using it more and more to figure out “What are my users doing?” and getting insights into what the business needs to do differently to become increasingly successful.
I think we’re at the beginning of that wave and I think you’ll begin to see more and more penetration. At this point in time, from what I’ve seen through my friends in various consulting roles, etc., there is a lot of buzz, but not many people know how to truly do it. You’ll see people install huge Hadoop farms of 20-25 machines, but they don’t really know what to do with them because somebody “sold” them, telling them that they needed it. The funny part is that the data that needs to be analyzed can sometimes be as small as a few GBs, and for a dataset that small, you don’t need all of that. You can actually just run a lot of it on a simple local machine when you have someone who knows what should be done.
Also, good analytics has immediate impact on the top line. If you apply these sorts of data mining and learning techniques, you will see results right away. It truly gives you eyes into what’s going on within your business. The fundamental thing about data mining is that your data is huge, and you can’t just go out looking for diamonds …
Nick: …you have to know what you’re looking for. The funny part to me about data mining is, “Are you looking for diamonds? Are you looking for gold? Are you looking for coal?” and I think when it comes to data mining, and even machine learning to an extent, you have to know what you’re looking for before you go off and look for it. A lot of people will get, as you referenced, a big Hadoop cluster, move over copies of their data, and then have no idea what they’re actually mining for.
Abhijit: Exactly. How to do it, and how to get value out of it. Most of the stuff is actually simple. You know, there’s the saying “Keep it simple…” and the same thing applies to data mining and machine learning also. Most of the time, simple things will get you 80% of the way.
Nick: But is it fair to say we’re a long way from QBASIC IF-THEN-ELSE logic? [laughs] The intense algorithms being built today are very impressive!
To close things out here, is there anything on the consumer tech side of the world that excites you or that you play with?
Abhijit: I think the mobile revolution is something that is pretty game-changing right now, and it is manifesting in several ways. One of those ways is the myriad of devices we are using in our day-to-day lives. I’m doing more and more searches on mobile devices these days.
Then there is that whole “Internet of Things” movement. Sensors are making their way into a lot of the things we are doing, like connected home, manufacturing, etc. Many companies have various takes on what it should be, and it’s very difficult to predict where it will all go, but I think it’s going to be a game-changer. Ten years down the road, a lot of these things will find their way into our houses and lives more and more and influence how we accomplish day-to-day tasks in a completely different way than we do today.
And again, the more sensors you have, the more data that’s being generated, the more stuff you can automate, the more stuff you can mine for information.
Nick: Awesome. I’ll admit that the IoT movement geeks me out, especially around home automation use-cases.
Thank you for taking the time to chat with me, Abhijit!
Abhijit: My pleasure!
Many thanks to Abhijit for helping me kick off this series, and I look forward to spotlighting as many of the engineers within Cohesity as possible.