How we built an AI code completion tool that will more than double developer productivity

An Interview with Tabnine’s CEO and Co-Founder, Dror Weiss

This week we chatted with Dror Weiss, the CEO and co-founder of Tabnine. Tabnine is an 🔥 incredible 🔥 AI code completion tool for programmers, and one that I’ve personally used to drastically improve my own productivity. Buckle up and enjoy a great interview with Dror to learn about how they launched the most popular AI tool for developers.

“AI Assistant for Developers. Use AI to ship better software faster.”

What does your company do?

Tabnine is an AI assistant that helps software developers write better code and work smarter. Our pioneering product provides AI code completions: developers get code suggestions as they type, and for the average developer about 30% of their code is written by Tabnine.

How did the company start?

Tabnine (formerly Codota) was founded in 2017 by me, Dror Weiss (CEO), and Eran Yahav (CTO). Based on our previous work on code analysis and simulation, we realized that, given the vast amount of commonality and standard patterns in code, it was inevitable that AI would become a critical part of the dev process. We set out to build it and pioneered the AI code assistant category.

Who are the company’s competitors and what makes Tabnine different?

The AI code assistance market is dominated by two players: Tabnine and Copilot by Microsoft. The technical approach of the two products is vastly different. Microsoft relies on a single huge monolithic AI model that can only be hosted by Microsoft. We favor the flexibility and agility that come with smaller, code-native AI models, each trained from the ground up on a specific language or area.

We currently have more than a dozen such models available for all popular languages, as well as community models trained by ecosystem partners. This gives customers the flexibility to run Tabnine either on our cloud or on their own network, and the ability to train custom AI models that capture the specific patterns in their repositories. Tabnine provides suggestions on every keystroke as well as full-line and full-function suggestions, whereas Copilot is limited to providing suggestions only on new lines because its inference cost and latency are much higher.

Python completions

What are some of the most interesting problems you’re solving?

It used to be that Tabnine gave small, frequent suggestions and Copilot gave large blocks. Tabnine now does both, which is unique.

There is a big architectural difference. Copilot relies on Codex and they believe in one model to rule them all. We decouple the code completion/product from the model. You can run Tabnine with any compatible model, and ours are much smaller and more agile. We built all of this from scratch. It’s not just one model but many models for different situations. You can train a model of your own on private code. It’s impossible to even run Codex by yourself, so Copilot cannot operate locally. However, with our new lighter models, you can run them locally on your computer or a company server. You don’t just use the model, you control it in every way.

Because the model is decoupled from the product, you can even train your own. That means there can be community-built models, and we can all share them.
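As a rough sketch of the decoupling described above, the product layer can depend only on a small interface while the models behind it are swapped freely. This is not Tabnine’s code: the names `CompletionModel`, `StaticModel`, and `CompletionEngine` are hypothetical, and the "model" here is a toy lookup table standing in for a trained one.

```python
from dataclasses import dataclass, field
from typing import Protocol


class CompletionModel(Protocol):
    """Hypothetical interface: anything that maps a code prefix to suggestions."""

    def complete(self, prefix: str) -> list[str]: ...


@dataclass
class StaticModel:
    """Toy stand-in for a trained model: suggests from a fixed list."""

    suggestions: list[str]

    def complete(self, prefix: str) -> list[str]:
        words = prefix.split()
        if not words:
            return []
        last = words[-1]  # complete the token currently being typed
        return [s for s in self.suggestions if s.startswith(last) and s != last]


@dataclass
class CompletionEngine:
    """The product layer: it knows the interface, not the model behind it."""

    models: dict[str, CompletionModel] = field(default_factory=dict)

    def register(self, language: str, model: CompletionModel) -> None:
        # Registering again for the same language swaps the backend in place.
        self.models[language] = model

    def complete(self, language: str, prefix: str) -> list[str]:
        model = self.models.get(language)
        return model.complete(prefix) if model is not None else []


if __name__ == "__main__":
    engine = CompletionEngine()
    engine.register("python", StaticModel(["print(", "property"]))
    print(engine.complete("python", "x = pr"))
    # A team could plug in a privately trained model without touching the engine.
    engine.register("python", StaticModel(["private_helper("]))
    print(engine.complete("python", "x = pr"))
```

The design choice to note is that `CompletionEngine` never imports a concrete model, so a per-language, community, or private model only has to implement `complete`.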

What were the challenges for creating this new approach?

Every component of the AI stack required a lot of work: a new training pipeline, new ways to process the code before sending it to the model, and new inference mechanics, just to name a few. On top of that, we needed a way to plug the product into different AI backends, and this decoupling took quite a bit of work. We no longer take models that were pre-trained on text and tune them on code. We take empty models and train them from scratch on code. That devotes the model’s entire learning capacity to code instead of only a small percentage of it. For example, a model trained on Python code learns the fundamentals of the code itself.
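One of the pieces mentioned above is processing code before it reaches the model. The interview does not describe how Tabnine does this, so the following is only an illustration of the general idea: splitting source with the language’s own lexer (here Python’s standard `tokenize` module) yields a vocabulary made of code tokens rather than natural-language words. The helper names `code_tokens` and `build_vocabulary` are hypothetical.

```python
import io
import tokenize
from collections import Counter

# Token types that only describe layout and carry no content for this toy vocabulary.
_SKIP = {
    tokenize.NEWLINE,
    tokenize.NL,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.COMMENT,
    tokenize.ENDMARKER,
}


def code_tokens(source: str) -> list[str]:
    """Split Python source into lexical tokens using Python's own tokenizer."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in _SKIP
    ]


def build_vocabulary(sources: list[str]) -> Counter:
    """Count how often each code token appears across a set of source files."""
    counts: Counter = Counter()
    for src in sources:
        counts.update(code_tokens(src))
    return counts


if __name__ == "__main__":
    vocab = build_vocabulary(["def f(a):\n    return a + 1\n", "x = f(1)\n"])
    print(vocab.most_common(3))
```

A production pipeline would more likely learn subword units from a code corpus; the point of the sketch is only that the units come from code, not prose.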

What will the world look like once your company achieves its vision?

We believe the future of software development will be a combination of developers and AI. Neither will dominate, and we’ll get the right combination of human intelligence and artificial intelligence. The human gives the direction, and the AI fills in the details. Tabnine is accelerating developers by 30–40%, and we believe this will continue to grow as the technology gets better.

We used to think we could double developer productivity, but we now think the ceiling is higher.
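To put these figures in perspective, here is a back-of-the-envelope model. It rests on an assumption that is not from the interview: writing time is proportional to the amount of code typed by hand, and accepted suggestions cost the developer no time. Under that simplification, the speed-up is 1 / (1 - share), where share is the fraction of code written by the tool. The function names are hypothetical.

```python
def speedup_from_share(share_written_by_tool: float) -> float:
    """Speed-up factor under the stated assumption that tool-written code is free.

    Assumption (not from the interview): time is proportional to hand-written code.
    """
    if not 0.0 <= share_written_by_tool < 1.0:
        raise ValueError("share must be in the range [0, 1)")
    return 1.0 / (1.0 - share_written_by_tool)


def share_needed_for(speedup: float) -> float:
    """Fraction of code the tool must write to reach a given speed-up factor."""
    if speedup < 1.0:
        raise ValueError("speedup must be at least 1.0")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    print(round(speedup_from_share(0.30), 2))  # 30% of code written by the tool
    print(share_needed_for(2.0))               # share required to double output
```

In this toy model a 30% share gives a factor of about 1.4, and doubling productivity corresponds to the tool writing half of the code. Real gains also depend on review time, rejected suggestions, and everything developers do besides typing.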

We believe that every developer will use AI, and that every organization will use AI as part of its stack. Just as organizations use source control and continuous integration, AI will be an integral part of how a company’s developers code together effectively. AI will fill in the gaps that today are tribal knowledge. It will make sure everyone writes code the same way, effortlessly. The AI will know the patterns and styles in use, ensuring consistency and high code quality. It’s not just about being faster; it’s about being a better developer and staying consistent with how the organization writes code.

The final part of our vision is to be a place where other parties can train their AI. Currently, most of what we deliver is based on our AI and what it thinks is good code. However, if we want to capture the knowledge of a community or of great individual developers, we can be a platform that enables others to write and publish best practices and enforce consistency.

companies <-> developer <-> ecosystems/communities

Mongoose code completions

What technology stack do you use, and why did you choose this stack?

Overall Tabnine is written in Rust. I was worried it would be difficult, but the team is really digging it. They love the elegance and performance. In addition to Rust, we use Python.

Do you lose productivity by using a low-level language? There is always a price and a tradeoff. There is a learning curve relative to something like Python. Not many people know Rust yet, so ramping up can take a few months. The tooling isn’t quite there yet, and build times can be a bit longer. There is no free lunch, but we love the benefits we do get.

Why did you choose Rust? The primary reasons we chose Rust are speed and safety, but there is more to it than that. There is almost no way to run models on developers’ local machines without working at a very low level. To make predictions on a local developer machine, we have to run on the bare metal.

What is your team like?

We currently have 30 employees in the US and Israel with plans to grow the team to over 40 people by the end of the year.

I imagine it takes a lot of brain power to build the AI models. How big is your research team? Our research team is surprisingly small. We have a flat team overall, and no one is dedicated to any single role. We have great people who wear multiple hats. We have 7-8 people who worked on our new code generation tools. By being lean, we’re able to move fast.

What makes your company unique?

Our team loves the mission and the problems they’re solving. There are many tradeoffs. They are users of the products they build at Tabnine. There isn’t a single mastermind, and everyone has ownership and the ability to contribute. It’s a ton of fun for people who really believe in the problem domain.

Any other interesting stats or uses of AI and machine learning worth highlighting that aren’t mentioned in the release, particularly that might have an application in the enterprise? Please discuss things like the datasets used to train various models, how potential bias in these models was mitigated, etc.

  • The importance of data when building trusted, secure models cannot be overstated. Tabnine made the decision early on to only train on fully permissive code (e.g., MIT and Apache 2.0 licenses) so that our customers could trust in the output of the AI.
  • We have built the unique ability to train and secure custom models based on a customer’s proprietary code for use by only their developers.
  • Our models can run on Tabnine Cloud, as well as in the customer’s VPC.
  • More technically, our code-native models (featured in our newly released next-generation platform) vastly outperform our previous GPT-based models because they train exclusively on code from a programming language, so the primitives the model learns are those that fit that specific dataset. Moreover, the entire learning capacity of these models is devoted to learning the regularities and patterns in code, yielding much better performance than textual models fine-tuned on code.
  • We have re-architected the platform so it can use completely new models built by Tabnine and partners. Our next-generation models are code-native and much more powerful than those used in earlier versions of Tabnine. As new research comes out from companies like Meta and SFDC, we are also able to build on that and deliver these models to our developers in code-time.
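The first point in the list above, training only on permissively licensed code, can be pictured as a filter over candidate repositories. This is a minimal sketch, not Tabnine’s pipeline: the `Repo` type is hypothetical, and the allowlist contains only the two licenses the interview names as examples, written as SPDX identifiers.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: illustrative allowlist; the interview gives MIT and Apache 2.0 only as examples.
PERMISSIVE_LICENSES = frozenset({"MIT", "Apache-2.0"})


@dataclass(frozen=True)
class Repo:
    """Hypothetical record for a candidate training repository."""

    name: str
    spdx_license: Optional[str]  # None when no license could be detected


def permissive_only(repos: list[Repo]) -> list[Repo]:
    """Keep repositories whose detected license is on the allowlist.

    Repositories with an unknown license are dropped rather than guessed at.
    """
    return [repo for repo in repos if repo.spdx_license in PERMISSIVE_LICENSES]


if __name__ == "__main__":
    candidates = [
        Repo("alpha", "MIT"),
        Repo("beta", "GPL-3.0-only"),
        Repo("gamma", None),
        Repo("delta", "Apache-2.0"),
    ]
    print([repo.name for repo in permissive_only(candidates)])
```

Treating a missing license as "not permitted" is the conservative default when the goal is output that customers can trust.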

Which are the major new product features/items of corporate news you’d like to highlight since the last funding round?

  • New models (as mentioned earlier) trained for Python, JavaScript, TypeScript, Java, Ruby, and 7 others.
  • Tabnine private model training: integrations with GitLab, Bitbucket, and GitHub.
  • Tabnine community models.
  • Tabnine Enterprise plan.
  • Strong user growth.
  • Enterprise customers’ interest in private models for their teams of developers.

What was it like transitioning from writing code to being a CEO?

I’ve been a software developer since high school in the early ’90s. I followed the standard career path from developer to team lead to managing organizations. Before founding Tabnine, I moved into a product role, which gave me a broader perspective. At Tabnine, I started out very hands-on again, and it was terrific. As we’ve grown, I’ve transitioned into the CEO role, and it’s been a learning process. You don’t have all the answers on day 1, and it’s best to get good mentors. You try to get better every month. Transitioning to CEO in a domain that I lived in made it easier, and I wouldn’t do it in a domain outside what I know. I already understood the domain and the problem, so the biggest challenge was learning to be a CEO.

Where can we go to learn more?

Visit Tabnine to learn more

Level Up Coding

Level Up is a community of 3 million monthly developers (learn more and follow us!). We also work with the best startups and most innovative tech companies 🔥

  • Are you a developer? Have the best companies reach out to you by joining the Level Up talent collective ➡️ Join Talent Collective as a Dev
  • Are you a company looking for developers? Hire FAANG-caliber engineers from the Level Up community ➡️ Hire Engineers
  • Do you want to be interviewed to share your company? Fill out our form to be interviewed and share your company with the readers of Level Up ➡️ Interview Request Form

We also provide free tools for developers to grow their careers:

Follow us on Twitter and LinkedIn


How we built an AI code completion tool that will more than double developer productivity was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.


This content originally appeared on Level Up Coding - Medium and was authored by Trey Huffine

APA

Trey Huffine | Sciencx (2022-06-15T12:08:37+00:00) How we built an AI code completion tool that will more than double developer productivity. Retrieved from https://www.scien.cx/2022/06/15/how-we-built-an-ai-code-completion-tool-that-will-more-than-double-developer-productivity/
