I wanted to use the solution in my company for topics. In the beginning, we did a PoC to use the data analytics part, but then we didn't go with that. We are currently using Amazon Kinesis Data Streams.
Senior Engineering Consultant at ASSURANCE IQ, INC.
Real User
Top 5
Jun 24, 2024
In our project, my company uses Amazon Kinesis and some other products from AWS. There are also products from Azure, like static web applications, or storage when we need good storage. In my new job, my company is completely dependent on Azure. The tool is mostly used for logging and as a temporary buffer. In our company, we generate lots of logs and some background events, and we want to send an aggregate of those to our data lake. We use the tool as a data stream and for processing.
Amazon Kinesis is a queuing or buffering system that we use as a central place to buffer the incoming data we receive from the source. The actual destination is open-ended; Amazon Kinesis is used as a buffer in between to decouple the workload.
It's a data streaming tool that does serverless data streaming to different services at Amazon. For the usage I've done, it's basically for getting data to Redshift. Kinesis can be one of the sources, and at the same time, Amazon Kinesis can be used to get the data out to other systems through other APIs. Kinesis is one of the tools I have used for typical data streaming. I've utilized Kinesis Data Streams along with Kinesis Video Streams for different use cases. Again, these are required for working with data as part of Amazon Connect configurations.
We use it to retrieve data and send it to the cloud. A few of our clients have Amazon Web Services, and we use Kinesis to deliver the data to their mobile devices and to their data processing systems, and also to do data analytics.
Amazon Kinesis is a service in AWS used for data ingestion. We pull data into Kinesis streams from various sources like OCS and then consume it for analysis and reporting.
Senior Data Engineer Consultant at a tech company with 201-500 employees
Real User
Top 10
Mar 1, 2023
We use the solution for streaming data, in simpler terms. For example, there is a backend application; we need to make that data available for analysis. On the backend side, we don't store the history. We get all the events regarding changes incrementally. If something changes, an event is generated. This is a convenient way to keep track of all the changes.
We had real-time streaming of data and a very large volume of user activity. We applied machine learning to the data streams. So, Kinesis basically made sure that we got the data, and we didn't lose the data.
Chief Technology Officer at a tech services company with 51-200 employees
Real User
Aug 25, 2021
We do data acquisition based on what is pumped in from remote sources and process it centrally so that we may present meaningful reports, charts, additional layers of support, or alerts to our customers.
Senior Software Engineer at a tech services company with 501-1,000 employees
Real User
Dec 21, 2020
In the simpler use case, we were just pumping in some data. We wanted an AWS service that would accept data in bursts. We were pushing in, for example, 500 records every 300 milliseconds; in other words, we were trying to pump around 1,500 records per second into whatever streaming service we chose. That streaming data would then go into another service, for example Lambda. Lambda would consume the data, and ultimately we would process it and store it in DynamoDB. This was the basic flow that we had.

We were looking for a service, and at that point in time the architects in our organization were asking us to leverage Kinesis to see how it performed. They wanted to see how it performs, so they were encouraging us to use it. Although we were looking at something as simple as SQS and SNS, they encouraged us to use Kinesis, and that is what we did.

There were a few considerations when we moved to Kinesis. What is the reliability? When I say reliability, I mean resilience, or the failure mechanism we thought was required for that use case, because we did not want to lose data. We also wanted the ability to replay from a certain point, because we were pumping in reports from a data source and we were always keeping track of the point at which we had stopped. So if we wanted to replay data that had already been processed by Kinesis but had failed in the Lambda, we wanted the ability to retry and replay the previously processed stream. That prompted us to use Kinesis, because it has the really good feature of letting you replay, for 24 hours, whatever you've already processed. That was one key feature that we thought we would need. In fact, performance-wise, it performed really well.

We also understood that it is actually meant for streaming, video streaming and things like that, and even data streaming; it does a good job with it. But mostly, we saw that it is a more suitable service for video streaming, simply because when we pump data into Kinesis, we don't know how to test it other than waiting for the data to come out the other end, hooking into Lambda, extracting the data, and processing it. That's the only way we can test it. That was a drawback, but it did not matter too much here. It did matter in the next project, and for the bigger use cases where we used Kinesis. This project was a simple use case, and Kinesis served it really well, so we kept it as-is.

We moved on to the next project, which was bigger. It was an event-driven architecture that we were trying out on one of the features. When we went event-driven, a few of the new features and new services from Amazon that are available right now were not yet available. We thought of using Kinesis again to stream the data from one microservice to another in a proper microservice architecture; we were using it as the communication medium between microservices. This is where the testing was a little complicated for us. Ultimately, what we realized from the entire exercise was that Kinesis may not have been the right choice of service for our use case. What we discovered were the benefits of using Kinesis and also its limitations in certain use cases. The biggest lesson learned for us was that before you take up anything like Kinesis, which is a big AWS service, there has to be a proof of concept (POC) done, to see whether it really suits the use case or not.
That is what we ultimately realized. Before that, there were a few other reasons why we chose Kinesis over DynamoDB streaming. Ultimately it was from one microservice to another, and each microservice had its own DynamoDB data store. We were thinking of using DynamoDB Streams with Kinesis to keep things simple. But it turned out that DynamoDB Streams have a limitation: whatever stream comes out of DynamoDB can be consumed by only a single client. With Kinesis it doesn't matter; any number of data sources can come in, and whatever Kinesis publishes can be consumed by any number of clients. That is why we went with Kinesis, in order to see how it performed.

Performance-wise, we found that we need to handle a crazy load, because we are part of the wagering industry, which needs peak performance. Online betting. In Australia, it's a regulated market and one of the most happening businesses. Here, performance is really important, because there are quite a few competitors, around 10 to 15 prominent ones, and if we have to stand out, our performance has to be beyond the customer's expectations. So, with that in mind, they knew our performance had to scale up. That is where we found the advantage of using Kinesis. It has been reliable; it has not failed to publish. Actually, it did fail once, but the failure was simply because we were pumping in more data than Kinesis can take. There is a limit that we discovered; I don't remember the numbers. But we did manage to break Kinesis by pumping in too much data.
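As a rough illustration of the burst-ingestion flow described in this review (records pushed in bursts to a Kinesis data stream and consumed downstream by Lambda into DynamoDB), here is a minimal Python sketch of the producer side. The stream name, record fields, and batch size are hypothetical, not taken from the reviewer's system.

import json
import time
import boto3

kinesis = boto3.client("kinesis")
STREAM_NAME = "report-events"  # hypothetical stream name

def send_burst(records):
    # Push one burst of records with a single PutRecords call.
    entries = [
        {
            "Data": json.dumps(rec).encode("utf-8"),
            "PartitionKey": str(rec["report_id"]),  # spreads records across shards
        }
        for rec in records
    ]
    response = kinesis.put_records(StreamName=STREAM_NAME, Records=entries)
    # PutRecords is not all-or-nothing: failed entries must be retried by the caller.
    if response["FailedRecordCount"]:
        failed = [entries[i] for i, r in enumerate(response["Records"]) if "ErrorCode" in r]
        kinesis.put_records(StreamName=STREAM_NAME, Records=failed)

# Roughly the rate mentioned above: ~500 records every 300 ms (~1,500 per second).
for _ in range(10):
    batch = [{"report_id": i, "payload": "..."} for i in range(500)]
    send_burst(batch)
    time.sleep(0.3)

On the consuming side, a Lambda event source mapping on the same stream receives these records in batches, and the stream's retention window (24 hours by default) is what makes the replay-from-a-checkpoint behaviour described above possible.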
Chapter Lead - Data and Infrastructure (Head of Department) at a media company with 51-200 employees
Real User
Nov 19, 2020
Our primary use case of this solution is as an integral part of our data pipeline, to deal with all of our big data problems. The traffic in our industry is highly volatile. At any given time we could have 10,000 users, and five minutes later it could be 100,000. We need systems fast enough to deal with that elasticity of demand, and the ability to deal with all the big data problems: volume, velocity, veracity, things like that. That's where we use the Kinesis platform. They have different iterations of it. The normal Kinesis Stream is a little bit more manual, but we use that for our legacy technology, and for the more recent ones, we use Kinesis Firehose.
We use this solution for quite large environments. We use it to capture and process a lot of data, for example, for data analytics and to query and analyze streaming data.
Senior Engineering Consultant at a tech services company with 201-500 employees
Real User
Oct 28, 2020
As part of my interest in obtaining Amazon certification and learning more about Kinesis, I am currently using it to capture streaming Twitter data. I get an avalanche of tweets and I need some technology to harness and capture them. I have used the streaming Twitter API to deal with it. Twitter is updated every half a second, so I'm tapping into the streaming API and capturing a lot of stuff.

It has also been used for the Internet of Things (IoT), where there is a lot of streaming data coming out and you need a mechanism to capture all of it from your devices. This includes things such as logs. My company was recently working on a project with Kinesis where we were capturing data from racecars. These racecars were emitting tons of data, and it needed to be captured by some kind of tool for analytics. Kinesis was used to capture all of that information.

The basic use case is just capturing the data. In the streams, you can do some interim transformations, but for the most part, the basic use case is just capturing data and persisting it in a data store like Amazon S3; Elastic MapReduce is another example of a permanent store. Once the data lands in some kind of permanent store, further transformations or aggregations can be done at that point.
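As a hedged sketch of that capture step, the snippet below forwards incoming messages (tweets, device logs, race-car telemetry) into a Kinesis data stream with boto3. The stream name and the message_source() generator are placeholders, not details from the projects described above.

import json
import boto3

kinesis = boto3.client("kinesis")

def message_source():
    # Placeholder for whatever produces the raw events
    # (a Twitter streaming client, an IoT gateway, a telemetry feed, etc.).
    yield {"device_id": "sensor-1", "value": 42}

for message in message_source():
    kinesis.put_record(
        StreamName="ingest-stream",               # hypothetical stream name
        Data=json.dumps(message).encode("utf-8"),
        PartitionKey=str(message["device_id"]),   # keeps related events on one shard
    )

Downstream, a Kinesis Data Firehose delivery stream or a consumer application can persist the captured records to Amazon S3 (or EMR) for the later transformations and aggregations mentioned above.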
One use case is consuming sales data and then writing it back into S3. That's one small use case that we have: from Kinesis Data Streams to Kinesis Data Firehose, and from Data Firehose to Amazon S3. There is also clickstream data coming in. For clickstream data, we established Kinesis Data Streams, and then from Kinesis Data Streams we dump the data into S3 using Kinesis Data Firehose. This is the main use case that we have. We did many POCs on Kinesis as well.

Also, one more live project using the DynamoDB database is running in Amazon. From DynamoDB we have triggers that automatically invoke Lambda, and then from Lambda we call Kinesis, and Kinesis writes back into S3. This is another use case.

Another thing that we did is Kinesis Data Analytics, where you can directly stream data. For that, we use a Kinesis data producer. From the producer, we establish a connection to the data stream, and then from the data stream to SQL, which is the Kinesis Data Analytics tool. From Kinesis Data Analytics, we again establish a connection to Data Firehose and then drive the data back into S3. These are the main use cases that we have for working with Amazon Kinesis.
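For the DynamoDB-to-Lambda-to-Kinesis path mentioned above, a Lambda handler subscribed to the table's DynamoDB stream could look roughly like the sketch below; the Kinesis stream name is a placeholder and the field handling is simplified.

import json
import boto3

kinesis = boto3.client("kinesis")
STREAM_NAME = "table-changes"  # hypothetical Kinesis data stream

def handler(event, context):
    # Triggered by DynamoDB Streams; relays each change record into Kinesis.
    # A Firehose delivery stream (or another consumer) can then write the records to S3.
    for record in event["Records"]:
        change = {
            "eventName": record["eventName"],          # INSERT / MODIFY / REMOVE
            "keys": record["dynamodb"].get("Keys"),
            "newImage": record["dynamodb"].get("NewImage"),
        }
        kinesis.put_record(
            StreamName=STREAM_NAME,
            Data=json.dumps(change).encode("utf-8"),
            PartitionKey=json.dumps(change["keys"]),
        )
    return {"recordsForwarded": len(event["Records"])}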
IT Linux Administrator and Cloud Architect at Gateway Gulf
Real User
Oct 25, 2020
We use it for:
* Security
* DDoS attacks
* Server application firewall
* Real-time analysis
* Streaming data
Kinesis makes it easy to collect and process streaming data, so we can get insights and react quickly to new information.
In terms of use cases, it depends on which component we're talking about, as we use three of the four components. The only one we don't use is Video Streams. Kinesis Data Streams is the module that we have been using the longest; essentially, we use it to hold data that will be processed by multiple consumers. We have multiple data sources and we use Kinesis to funnel that data, which is then consumed by multiple other consumers. We gather data coming from IoT devices, user phones, databases, and a variety of other sources, and then, as we have multiple consumers, we use Kinesis to gather the data and process it directly in Lambda, in Firehose, or in other applications.
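A minimal sketch of one such consumer, assuming a hypothetical stream called "funnel-stream": because every consumer tracks its own shard iterators (or has its own Lambda event source mapping), any number of applications can read the same Kinesis data stream independently, which is the fan-out pattern described above.

import time
import boto3

kinesis = boto3.client("kinesis")
STREAM = "funnel-stream"  # hypothetical stream name

def process(data: bytes) -> None:
    # Stand-in for this consumer's work; other consumers (Lambda, Firehose, etc.)
    # read the same records without interfering with this one.
    print(len(data), "bytes received")

shards = kinesis.list_shards(StreamName=STREAM)["Shards"]
iterators = {
    shard["ShardId"]: kinesis.get_shard_iterator(
        StreamName=STREAM,
        ShardId=shard["ShardId"],
        ShardIteratorType="LATEST",
    )["ShardIterator"]
    for shard in shards
}

while True:
    for shard_id, iterator in list(iterators.items()):
        result = kinesis.get_records(ShardIterator=iterator, Limit=100)
        for record in result["Records"]:
            process(record["Data"])  # records are not removed on read, unlike a queue
        iterators[shard_id] = result["NextShardIterator"]
    time.sleep(1)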
Senior Software Engineer at a computer software company with 201-500 employees
Real User
Oct 20, 2020
I work as a senior software engineer at an eCommerce analytics company, and we have to process a huge amount of data. Only a few people within our organization use Kinesis. My team, which includes three backend developers, simply wanted to test out different approaches.

We are now in the middle of migrating our existing databases, in MySQL and Postgres, to Snowflake. We use Kinesis Firehose to ingest data into Snowflake at the same time that we ingest data into MySQL, without it impacting performance. If you ingest into two databases in a synchronous way, the performance is very slow. We wanted to avoid that, so we came up with this solution to ingest the data through the stream. We use Kinesis Firehose to send the data to the stream, which then buffers the data for roughly two minutes. Afterwards, it places the files in an S3 bucket, which is then loaded automatically via an integration with Snowflake called Snowpipe. Snowpipe reads and ingests every message and every file that's in the S3 bucket. This stage doesn't bother us because we don't need to wait for it. We just stream the data, fire and forget. Sometimes, if a record is not ingested successfully, we have to retry. Apart from that, it's great because we don't need to wait and the performance is great. There are some caveats there, but overall, the performance and reliability of it all have been great. This year, 100% of the time when there was an issue in production, it was due to a bug in our code rather than a bug in Kinesis.
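As a rough illustration of that fire-and-forget ingestion, assuming a hypothetical Firehose delivery stream named "snowflake-landing" that buffers into S3 (where Snowpipe picks the files up), the producer side can be as small as this:

import json
import boto3
from botocore.exceptions import ClientError

firehose = boto3.client("firehose")
DELIVERY_STREAM = "snowflake-landing"  # hypothetical delivery stream name

def ingest(record, retries=3):
    # Send one record to Firehose; retry a few times if the put fails,
    # matching the occasional-retry behaviour described above.
    payload = (json.dumps(record) + "\n").encode("utf-8")
    for attempt in range(retries):
        try:
            firehose.put_record(
                DeliveryStreamName=DELIVERY_STREAM,
                Record={"Data": payload},
            )
            return
        except ClientError:
            if attempt == retries - 1:
                raise

# Called alongside the normal MySQL write; Firehose buffers for a couple of minutes,
# drops the batch into the S3 bucket, and Snowpipe loads it into Snowflake from there.
ingest({"order_id": 123, "status": "created"})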
Principal Data Engineer at a transportation company with 1,001-5,000 employees
Real User
Oct 15, 2020
Our primary use case of this solution is a streaming bus architecture: we get events, and they come in through Kinesis as JSON events. It's usually a change to a database, but it can be any event from our application. That feeds into Kinesis, and then we have a Lambda that consumes the events and finally puts them into a data warehouse, which is the ultimate goal. So it's a near real-time data warehouse.
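A minimal sketch of the consuming Lambda in such a pipeline, assuming the events are JSON and using a placeholder load_into_warehouse() for whatever actually loads the warehouse (that function is not from the reviewer's setup):

import base64
import json

def load_into_warehouse(rows):
    # Placeholder for the warehouse load step (e.g. a COPY into the target table).
    print(f"loading {len(rows)} rows")

def handler(event, context):
    # Consumes a batch of Kinesis records; each record carries one JSON change event.
    rows = []
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        rows.append(json.loads(payload))
    load_into_warehouse(rows)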
I use the solution in my company for streaming purposes, considering that my company has an AI-based camera for streaming.
I work in a gaming company that builds games for the global market. We use Amazon Kinesis to stream events.
We collect data from AWS IoT Core and then capture the stream in Amazon Kinesis. The data is then stored in S3 and shifted to Snowflake for analysis.
We are using Kinesis' third-party streaming engine. We are using the AWS cloud and are moving to Azure.