Categories
Entrepreneurship Technology VC

The power of data science

TL;DR: I’m a member of the board of directors of Alphacruncher, the EdTech startup building nuvolos.cloud. We’ve just raised CHF 1.5M to rock the EdTech space.

I’m a natural scientist by education. I do my best to operate on objective reasoning, rely on data, and trust the interpretation of experts. A lot of that thinking goes back to my time at university (NTB Buchs in Switzerland, University of Karlsruhe in Germany, but also studies abroad, e.g. Harvard Business School).

When founding Tree.ly, it was clear to us that data science was going to play an important role. We started with some local Jupyter Notebooks in Visual Studio Code (which actually works pretty well), but quickly realized we needed something that would let us collaborate within the team and with others.

Analyzing Forests
Image Credits: Tree.ly, Ocell.io, illwerkevkw, TU Vienna

We continued with tools like Amazon SageMaker, Microsoft Notebooks, and Google Colaboratory. We also ran JupyterHub within our Kubernetes infrastructure. It’s amazing to see how easily anybody can do data science nowadays! I strongly encourage you to check out these tools.


That was until my friend Oliver from Zeughaus connected me with Alexandru Popescu from Alphacruncher, and I learnt about Nuvolos Cloud.

It’s primarily targeted at educational customers (that’s also where it comes from), but I also see great potential for “commercial” data science. What do I find especially cool?

Snapshots and shared workspaces

Inside the platform, one can easily snapshot and share datasets with others. For example, a teacher can create an environment for an exercise and share that very environment with the entire class. Each student can then continue in their own personal environment and hand in the solved problem to the teacher afterwards.

Or assume that you wrote a scientific paper whose findings are based on a larger dataset and a couple of computations. You can not only share the PDF, but also give readers access to the full environment so they can validate the findings, or even build upon them. Magic!

In our case, we’re using larger datasets (e.g. a few TB of airborne laser scanning point clouds, parcel data and all kinds of other readings) that we work on collaboratively. We can use a shared kernel image with all dependencies installed and work on the same shared data folders, while still preserving our personal preferences and spaces.

Resource efficiency / Shared resources

Data science has the characteristic that you need a lot of resources for a rather short amount of time, and then for a much longer time you don’t need the instances running at all. Nuvolos and its billing/usage model make that cloud elasticity super simple for the user. One can book a base level of resources (that are only spun up when needed, and automatically terminated afterwards), but also book spike resources.

That’s not only convenient for companies, but even more so for universities or school classes that need these resources for every student.

Everything in the Cloud

We spoke mainly about Jupyter Notebooks, but the Nuvolos environment also provides access to Snowflake (one of the coolest databases on earth) and many other tools. In the cloud. In the browser. In a shared space. The team is continuously working on expanding that tool suite. Right now it includes RStudio, VS Code, Spyder, JupyterLab, Julia, Stata, Matlab, GNU Octave, SAS, IBM SPSS, REDCap, Airflow and others.

A winning team

During due diligence I took a look at the tech stack and got to know some of the core team members personally. They are not only using state-of-the-art technology and methodology, but have also managed to attract top talent to build and operate the product.

I couldn’t be more excited to play a small role in the journey of Alexandru and his team. Thanks for letting me ride with you.

Categories
Crate Entrepreneurship

Podcast: Distributed databases and product-market fit

I had the opportunity to record an episode of the “Digitale Leute Podcast” with my friend Oliver Thylmann from Giant Swarm and would like to reshare it here as well:

featuring electric skateboards and bitcoins

Why industrial IoT startup Crate.io can easily do distributed databases but still had trouble finding product-market fit

Being number one on Hacker News or winning the TechCrunch Disrupt Startup Battle might help create hype around your startup. But it doesn’t help you find product-market fit. Jodok Batlogg, CTO at industrial IoT startup Crate.io, explains in this episode why they needed six years to finally hit product-market fit.

Digitale Leute Insights is the podcast for passionate product people. We interview product developers from around the world and take a closer look at their tools and tactics.

When Jodok Batlogg was the CTO at StudiVZ, the largest social network in Germany before Facebook got traction in Europe, the biggest problem they had was data storage. The data of 60 million users was spread across about a thousand servers, and Docker had not been invented yet. It was clear to Jodok that the amount of data would keep growing, and the problems along with it.

Four years later, he founded Crate.io with a prototype of CrateDB. The open-source distributed SQL database management system used Elasticsearch when that was still “a crazy guy sitting in Israel coding at a new kind of approach on how to deal with distributed computing,” as Jodok puts it in this episode.

His startup enjoyed two waves of hype early on. The first was a Hacker News article that resulted in the company going in the “Big Data SQL in real-time” direction. The second boost came after winning the TechCrunch Disrupt Startup Battle, which Jodok completed with a broken fibula. Although it helped keep the company alive by bringing in investors, they had yet to find product-market fit, Jodok admits. “On the product side, it was fine, but from the going-to-market side and on the monetization side, it was totally wrong.”

How to gain product-market fit

The turnaround came as late as five years after the company was founded, when it ran a customer survey. The result was a transformation into a more enterprise-focused company concentrating on industrial IoT. They changed the open-source model, which allowed customers to perceive Crate.io as a product worth buying. It also helped the sales department to actually sell the product. Before that, even hiring extra sales employees had resulted in zero sales.

Today Crate.io is a remote-first company, led by Jodok Batlogg as CTO from Dornbirn, a small town in Vorarlberg, Austria. The mountainous terrain and Jodok’s attempt to stop using his Audi lead him to try out all the new electric boards, bikes, and gadgets on the market. He shares this passion with our host Oliver Thylmann. This is why they close this episode by discussing electromobility and paying for it with bitcoin.

About the Host
Oliver Thylmann is a serial entrepreneur based in Cologne, Germany. He is the co-founder of Giant Swarm, a 35-person SaaS company providing managed microservice infrastructure to big enterprises.