🌟 It's the happiest of Fridays! Our humble little open source library Quix Streams just passed 1,000 GitHub stars! A huge thank you to our wonderful community for voting YES to powerful and simpler stream processing in Python. 💎 We'd also like to announce that Quix Streams v2.7.0 is now out and it adds a highly requested and powerful feature: exactly-once processing. ❤️ If you haven't explored Quix Streams yet, today is a great day to do it. Join our community on Slack, let's gaze at the stars and build the future together!
Quix
Software Development
London, England 2,850 followers
Developer tools to build data apps on Kafka — with Python.
About us
Quix is an open source Python library for processing data in Kafka. Designed around a DataFrame API, it provides a best-in-class Python developer experience for building real-time ML and analytics pipelines. Stateful, scalable and fault tolerant. No wrappers. No JVM. No cross-language debugging. Deploy locally or on Quix Cloud for easy management. Experience the simplicity and power of stream processing with Quix.
- Website
- https://quix.io/
- Industry
- Software Development
- Company size
- 11-50 employees
- Headquarters
- London, England
- Type
- Privately Held
- Founded
- 2020
- Specialties
- Streaming Analytics, Serverless, Event-Driven Architecture, Cloud, Event Data, Machine Learning, Metadata, and Binary Blob Data
Updates
-
Check out this stream processing pipeline Tomáš Neubauer built for the UK Election, today. The app integrates the following stack using Quix Streams ❤️ Python 🐍 and Kafka 🎼 💬Reddit, Inc. APIs provide the source data 🔥OpenAI is used for sentiment scoring 🪣InfluxData for data storage 😎Streamlit for data visualisation Try building it for yourself on Quix Cloud, for free.
As promised, here is the version of the real-time analysis of Reddit for the UK parliament election tomorrow. Link to Streamlit in the comment below.
-
This second episode in the course covers the basics of stateful transformations. You’ll learn the concepts, understand why state is useful and often required and lastly get to grips with state in your code.
Quix Academy | Ep 2 | Stateful Data Transformation Basics
www.linkedin.com
-
Quix reposted this
🧑🏫 This third episode in the course expands on stateful transformations from the last episode with an industrial use case: downsampling. You’ll learn about time series data, the challenges when dealing with high velocity data and why downsampling is often required. You’ll be introduced to InfluxDB (a time series database) and work through solving common challenges in a coding session.
Quix Academy | Ep 3 | Stateful Data Transformation Use Case: Downsampling
www.linkedin.com
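As a rough illustration of the downsampling idea mentioned above, here is a minimal plain-Python sketch. It is not the course's code and does not use the Quix Streams or InfluxDB APIs; the function name `downsample`, the `(timestamp_ms, value)` tuple layout and the choice of the mean as the aggregate are all assumptions made for the example. The `buckets` dictionary plays the role of state: it has to be remembered between incoming readings.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def downsample(
    readings: Iterable[Tuple[int, float]], window_ms: int
) -> List[Tuple[int, float]]:
    """Reduce (timestamp_ms, value) readings to one mean value per fixed window."""
    # State kept across readings: window start -> values seen in that window.
    buckets: Dict[int, List[float]] = defaultdict(list)
    for timestamp_ms, value in readings:
        # Align each timestamp to the start of its non-overlapping window.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        buckets[window_start].append(value)
    # Emit one aggregated point per window, ordered by time.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Five high-frequency readings collapse into two one-second points.
raw = [(0, 1.0), (400, 3.0), (999, 5.0), (1000, 10.0), (1500, 20.0)]
downsampled = downsample(raw, window_ms=1000)
```

A real streaming job would emit each window as soon as it closes rather than collecting everything first, so this batch version only shows the grouping logic.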
-
There are only a couple of days left until the second episode of our Academy Course. This time Tomáš Neubauer (our CTO and Python guru) takes you through stateful data transformations using Quix Streams, live from his comfy chair. Check out the info below for when and where. If you haven't seen episode 1 yet, you can watch it here: https://lnkd.in/ez4gbd5r
This second episode in the course covers the basics of stateful transformations. You’ll learn the concepts, understand why state is useful and often required and lastly get to grips with state in your code.
Quix Academy | Ep 2 | Stateful Data Transformation Basics
www.linkedin.com
-
What do you get when you cross Quix Streams and OpenAI's ChatGPT? Well, we don't know what that would be called, but you'll certainly find Tomáš Neubauer in the middle of it like a kid that got all the candy! If you do one thing today, take a look at this code: it's a fun and useful project. Extend it and share your work, we'd love to see what you do with it.
As there are many elections around the world right now, I have put together this little stream processing pipeline to analyse Reddit data in real-time to measure how people react to unfolding situations during debates, election nights, etc. Feel free to fork it and change the analysis, right now it is a blend of ChatGPT and transformers. Streamlit dashboard: https://lnkd.in/eawYmiwH Link to the repo in the comments below. Working on the UK and France versions right now...
-
We had a fantastic session with Zain Hasan from Weaviate yesterday. Did you manage to catch it live? If not, watch it now on YouTube 👇👇 📺 https://lnkd.in/exgna8Gh "Powering Vector Search with Real-Time Data" was also supposed to air on LinkedIn, but due to a technical hitch that didn't happen. Apologies if you were hoping to see it here 🥲
Powering Vector Search with Real-Time Data
https://www.youtube.com/
-
Quix reposted this
Senior Machine Learning Engineer • MLOps • Founder @ Decoding ML ~ Posts and articles about building production-grade ML/AI systems.
Flink and Kafka Streams will be replaced by these 2 streaming engines, which integrate 100% into Python's ecosystem for streaming applications.

For some quick context, a streaming system consists of 2 core components:
- a data persistence layer (e.g., Kafka, Redpanda)
- a streaming engine (e.g., Flink, Kafka Streams)
→ The streaming engine reads data from the persistence layer's topics, processes it, and writes it to another topic.

Until now, the most popular choice for the streaming engine was Flink. But for all the ML folks out there, it has a HUGE problem: it is based on the JVM and tied to the programming languages around it (e.g., Java, Scala, Kotlin). In most use cases, the Data Engineering team takes the prototype from the Data Science team and rewrites it in Flink. Unfortunately, this rewriting process usually results in training-serving skew, as the slightest difference in how the data is processed between training and inference can completely break the ML model.

The solution? Use a streaming engine that supports Python, which can be directly integrated with your ML training code.

Here are the 2 big players when it comes to streaming engines in Python ↓↓↓

Bytewax
- It is built in Rust for speed and resource optimization.
- It exposes a Python interface so it can be fully leveraged in AI applications.
- It follows an Object Oriented Programming (OOP) approach, which suits enterprise applications.
- It has built-in connectors for Kafka, SQS, and more!
→🔗 https://rebrand.ly/bytewax

Quix Streams
- It is built in Python for ease of use and debugging.
- It follows almost 100% of the Pandas API, making it perfect for data scientists who work all day long with Pandas.
- It is scalable, as it leverages Kafka and K8s to provide data partitioning, consumer groups, state management and replication.
- It has built-in Kafka connectors.
→🔗 https://quix.io/

To conclude:
- Both follow the "Python. No JVM." paradigm.
- Bytewax suits MLEs and SWEs who want to build more complex applications.
- Quix is perfect for data scientists who want to leverage the power of a streaming engine.

#machinelearning #mlops #datascience

💡 Follow me for daily content on production ML and MLOps engineering.
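The read, process, write loop described in the post above can be sketched in a few lines of plain Python. This is only an in-memory illustration under stated assumptions: the "topics" are ordinary queues standing in for a persistence layer such as Kafka, and `run_engine`, `to_fahrenheit` and the topic names are hypothetical names invented for the example, not part of Bytewax, Quix Streams, Flink or any Kafka client.

```python
from collections import deque
from typing import Callable, Deque, Dict

# Hypothetical in-memory stand-in for the persistence layer:
# each topic is a named FIFO queue of messages.
Topics = Dict[str, Deque[dict]]


def run_engine(
    topics: Topics, source: str, sink: str, process: Callable[[dict], dict]
) -> int:
    """Drain `source`, transform each message, append the result to `sink`."""
    handled = 0
    while topics[source]:
        message = topics[source].popleft()      # read from the input topic
        topics[sink].append(process(message))   # write to the output topic
        handled += 1
    return handled


def to_fahrenheit(message: dict) -> dict:
    """Illustrative transformation: add a Fahrenheit field to a reading."""
    return {**message, "temp_f": message["temp_c"] * 9 / 5 + 32}


topics: Topics = {
    "readings": deque(
        [{"sensor": "a", "temp_c": 20.0}, {"sensor": "b", "temp_c": 25.0}]
    ),
    "readings-fahrenheit": deque(),
}
handled = run_engine(topics, "readings", "readings-fahrenheit", to_fahrenheit)
```

Because the transformation is an ordinary Python function, the same function could in principle be imported by both training and serving code, which is the point the post makes about avoiding a rewrite in another language.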
-
-