Before learning about Kinesis, you should understand streaming data.
What is streaming data?
Streaming data is data that is generated continuously by thousands of data sources, which typically send their data records simultaneously and in small sizes.
The following are examples of streaming data:
- Purchases from online stores
People buying products on amazon.com generate streaming data, such as transaction and product records.
- Stock prices
Stock prices change continuously, so price updates are another example of streaming data.
- Game data
Suppose a user is playing Angry Birds and the application is streaming data back to a central server. This streaming data could describe what the user is doing and what the current score is.
- Social network data
Social network data is another example of streaming data. Suppose you visit Facebook, update your status, and post on a friend's wall. All of this data is streamed.
- Geospatial data
When you use Uber, your device is connected to the internet. The Uber application continuously reports where the driver is and where you are, and it queries the map to give you the best possible route to your destination. This is also a good example of streaming data.
- IoT sensor data
IoT sensors deployed around the world continuously report measurements such as temperature.
What is Kinesis?
Kinesis is a platform on AWS to which you send your streaming data. It makes it easy to load and analyze streaming data, and it also lets you build custom applications based on your business needs.
Core Services of Kinesis
- Kinesis Streams
- Kinesis Firehose
- Kinesis Analytics
Kinesis Streams
- Kinesis streams consist of shards.
- Each shard supports up to 5 transactions per second for reads, up to a maximum total read rate of 2 MB per second, and up to 1,000 records per second for writes, up to a maximum total write rate of 1 MB per second.
- The data capacity of your stream is a function of the number of shards that you specify for the data stream. The total capacity of the Kinesis stream is the sum of the capacities of all shards.
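The arithmetic behind the shard limits above can be sketched in a few lines of Python. This is only an illustration using the per-shard figures quoted in this section; the names `stream_capacity` and `shards_needed` are made up for the example and are not part of any AWS API.

```python
import math
from dataclasses import dataclass

# Per-shard limits as quoted in the list above.
WRITE_MB_PER_SEC = 1
WRITE_RECORDS_PER_SEC = 1000
READ_MB_PER_SEC = 2
READ_TRANSACTIONS_PER_SEC = 5


@dataclass(frozen=True)
class StreamCapacity:
    shards: int
    write_mb_per_sec: int
    write_records_per_sec: int
    read_mb_per_sec: int
    read_transactions_per_sec: int


def stream_capacity(shards: int) -> StreamCapacity:
    """Total stream capacity is the per-shard capacity times the shard count."""
    if shards < 1:
        raise ValueError("a stream needs at least one shard")
    return StreamCapacity(
        shards=shards,
        write_mb_per_sec=shards * WRITE_MB_PER_SEC,
        write_records_per_sec=shards * WRITE_RECORDS_PER_SEC,
        read_mb_per_sec=shards * READ_MB_PER_SEC,
        read_transactions_per_sec=shards * READ_TRANSACTIONS_PER_SEC,
    )


def shards_needed(write_mb: float, write_records: float, read_mb: float) -> int:
    """Smallest shard count that satisfies every expected per-second rate."""
    return max(
        1,
        math.ceil(write_mb / WRITE_MB_PER_SEC),
        math.ceil(write_records / WRITE_RECORDS_PER_SEC),
        math.ceil(read_mb / READ_MB_PER_SEC),
    )
```

For example, a workload that writes 3.5 MB and 2,500 records per second and reads 9 MB per second is limited by its read rate, so `shards_needed(3.5, 2500, 9)` asks for five shards.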
Architecture of Kinesis Stream
Suppose we have EC2 instances, mobile phones, laptops, and IoT devices that produce data. They are known as producers. The data is sent to the Kinesis stream and stored in shards. By default, the data is stored in shards for 24 hours, and you can increase the retention period to 7 days. On the other side of the stream are EC2 instances known as consumers. They read the data from the shards and turn it into useful data. Once the consumers have performed their calculations, the useful data is moved to one of the AWS services, e.g., DynamoDB, S3, EMR, or Redshift.
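To make the producer, shard, and consumer roles concrete, here is a toy in-memory model in Python. It is not the Kinesis API: `ToyShard` and `average_temperature` are hypothetical names invented for this sketch, and the only behavior modeled is the flow described above (producers append records, records expire after the retention window, consumers read and compute).

```python
import time
from collections import deque
from typing import Deque, List, Optional, Tuple

DEFAULT_RETENTION_SECONDS = 24 * 60 * 60  # 24 hours, the default named above


class ToyShard:
    """In-memory stand-in for one shard: an ordered log with a retention window."""

    def __init__(self, retention_seconds: int = DEFAULT_RETENTION_SECONDS) -> None:
        self.retention_seconds = retention_seconds
        self._records: Deque[Tuple[float, bytes]] = deque()

    def put(self, data: bytes, now: Optional[float] = None) -> None:
        """Producer side: append a record with its arrival time."""
        arrival = time.time() if now is None else now
        self._records.append((arrival, data))

    def read(self, now: Optional[float] = None) -> List[bytes]:
        """Consumer side: return the records still inside the retention window.

        Reading does not remove records, so several consumers can read the same data.
        """
        current = time.time() if now is None else now
        # Expire records that are older than the retention window.
        while self._records and current - self._records[0][0] > self.retention_seconds:
            self._records.popleft()
        return [data for _, data in self._records]


def average_temperature(shard: ToyShard, now: Optional[float] = None) -> Optional[float]:
    """Example consumer: turn raw sensor readings into one useful number."""
    readings = [float(record.decode()) for record in shard.read(now)]
    return sum(readings) / len(readings) if readings else None
```

In a real deployment the consumer would then write its result to a store such as DynamoDB or S3; that step is left out here to keep the sketch self-contained.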
Kinesis Firehose
- Kinesis Firehose is a service used for delivering streaming data to destinations such as Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service.
- With Kinesis Firehose, you do not have to manage the resources.
Architecture of Kinesis Firehose
Firehose can deliver the data to S3. The other destination can be Redshift; in that case, the data is first written to S3 and then copied from S3 into Redshift.
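The two-step path to Redshift (stage in S3, then copy) can be illustrated with a small Python sketch that builds the kind of Redshift `COPY` statement used to load staged files. The helper names, bucket, table, and role ARN below are all hypothetical, and the JSON format option is an assumption about how the records are encoded.

```python
def staged_object_uri(bucket: str, prefix: str) -> str:
    """Step 1: the S3 location where the streamed records are staged."""
    return f"s3://{bucket}/{prefix.strip('/')}/"


def redshift_copy_statement(table: str, s3_uri: str, iam_role_arn: str) -> str:
    """Step 2: a COPY statement that loads the staged files into a Redshift table.

    Assumes the staged records are JSON objects whose keys match the column names.
    """
    return (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS JSON 'auto';"
    )


# Hypothetical names, for illustration only.
uri = staged_object_uri("example-bucket", "/firehose/events/")
statement = redshift_copy_statement(
    "events", uri, "arn:aws:iam::123456789012:role/ExampleRedshiftRole"
)
```

The point of the sketch is only the ordering: nothing is loaded into Redshift until the data already exists in S3.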
Kinesis Analytics
Kinesis Analytics is a service of Kinesis in which streaming data is processed and analyzed using standard SQL.
Architecture of Kinesis Analytics
Differences between Kinesis Streams and Kinesis Firehose
- Kinesis Streams is managed manually (for example, you choose the number of shards), while Kinesis Firehose is fully managed and automated.
- Kinesis Streams can feed many services through its consumers, while Kinesis Firehose delivers only to its supported destinations, such as S3, Redshift, or Elasticsearch.
- A Kinesis stream has a retention window, which is 24 hours by default and can be extended to 7 days, while Kinesis Firehose does not have such a retention window.
- Kinesis Streams delivers the data to consumers, which you run yourself, for analysis and processing. With Kinesis Firehose you do not have to worry about consumers, as Firehose itself can process the data by using a Lambda function.