This content originally appeared on DEV Community and was authored by Eric Wang
After working at companies big and small, I often found myself poring over logs to answer business questions for non-technical users. To be honest, a more sophisticated server-side SDK for instrumenting event data would have been ideal, with that data then streamed into a Kafka queue. This would have allowed me to write an ETL job to transform the data and store it in a data warehouse, which I could then integrate with tools like Looker or Tableau so business users could create dashboards themselves! If only there were infinite time and energy for such indulgent engineering projects... It would have been marvelous!
In practice, I wrestled with messy log data until it could be condensed into a nice number suitable for a dashboard. If more data was necessary, I would simply log it in the code, then get back to building features for customers or arguing with strangers on Reddit.
After my cofounder Seb and I joined DoorDash through its acquisition of Bbot, a restaurant technology startup, we had to integrate Bbot data into DoorDash’s data pipeline. The goal was to answer questions such as the number of completed checkouts per day, average check subtotals, and total daily sales. Once again, we found ourselves in a familiar problem space. That was the last straw, and we quit immediately!
In reality, it took another month to gather the courage to actually quit our day jobs, but that's only a minor detail.
We built Siege because we wanted a fast and easy way to get real-time data into a format that is easily queryable using plain SQL. We spent a few months building an agent in C that you can just plop onto your server(s) to mine data directly from API traffic while using negligible resources. You can then pick and choose the data fields to track from a catalog of data points with just the click of a button. We also built a user-friendly UI that lets you visualize the data and create real-time dashboards. All in less than 10 minutes.
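To make "easily queryable using plain SQL" concrete, here is a minimal sketch of the kind of query that answers the questions mentioned earlier (completed checkouts per day, average check subtotal, total daily sales). It is not Siege's actual schema or API: the `checkouts` table, its columns, and the sample rows are all made up for illustration, and an in-memory SQLite database stands in for wherever the data really lives.

```python
import sqlite3

# Hypothetical table and columns, purely for illustration.
# This is NOT Siege's actual schema; SQLite is only a stand-in store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkouts (completed_at TEXT, subtotal_cents INTEGER, status TEXT)"
)

# Made-up sample rows: three completed checkouts and one abandoned one.
conn.executemany(
    "INSERT INTO checkouts VALUES (?, ?, ?)",
    [
        ("2024-06-17T12:01:00", 2500, "completed"),
        ("2024-06-17T18:30:00", 4100, "completed"),
        ("2024-06-17T19:05:00", 1800, "abandoned"),
        ("2024-06-18T09:15:00", 3200, "completed"),
    ],
)

# One plain SQL query covering all three business questions per day.
DAILY_SALES_SQL = """
SELECT date(completed_at)          AS day,
       COUNT(*)                    AS completed_checkouts,
       AVG(subtotal_cents) / 100.0 AS avg_subtotal,
       SUM(subtotal_cents) / 100.0 AS total_sales
FROM checkouts
WHERE status = 'completed'
GROUP BY day
ORDER BY day
"""

daily_rows = conn.execute(DAILY_SALES_SQL).fetchall()
for row in daily_rows:
    print(row)
```

Once the fields you care about land in a table like this, a dashboard number is just one aggregate query away, with no log wrangling in between.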
We have 4 criteria when choosing tools for our own use:
- We hate reading documentation, so it better be short.
- We need to be able to derive value from it within the first 10 minutes.
- We need to be able to play with it for free, no sales calls ever.
- Dark mode.
We were fully committed to these principles while building our own tool. Join our public beta for free! https://siegeai.com
Eric Wang | Sciencx (2024-06-18T15:34:31+00:00) Real time data pipeline with a single command. Retrieved from https://www.scien.cx/2024/06/18/real-time-data-pipeline-with-a-single-command/