Badges
55 Points
3 Years
User Activity
Over 2 years ago
Answered a question: What are the benefits of streaming analytics tools?
We are in a world where data drives decisions. Every business should be capable of making decisions in real time or near real time. This is where stream analytics comes to the rescue.
Stream analytics helps businesses take quick decisions based on the data so that they…
About 3 years ago
Answered a question: What is your primary use case for Apache Airflow?
The primary use case is the orchestration and automation of ELT/ETL data pipelines.
Apache Airflow is great in this respect, and its scheduling options make it possible to fully automate pipelines based on the use case.
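As a rough illustration of what orchestrating an ELT/ETL pipeline means, the sketch below runs three hypothetical tasks (extract, transform, load) in dependency order using only Python's standard library. It is not the Airflow API: the task functions, the `TASKS` and `DEPENDENCIES` tables, and the `run_pipeline` helper are all assumptions made for this example. In Airflow, the same dependencies would be declared in a DAG and triggered by its scheduler.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, dependencies):
    """Run every task after all of its upstream tasks have finished.

    tasks: dict mapping task name -> callable that takes a shared context dict
    dependencies: dict mapping task name -> set of upstream task names
    """
    context = {}
    order = []
    # static_order() yields each task only after all of its upstream tasks.
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name](context)
        order.append(name)
    return order, context


# Hypothetical three-step pipeline, for illustration only.
def extract(ctx):
    ctx["raw"] = ["10", "20", "30"]


def transform(ctx):
    ctx["clean"] = [int(value) for value in ctx["raw"]]


def load(ctx):
    ctx["loaded_total"] = sum(ctx["clean"])


TASKS = {"extract": extract, "transform": transform, "load": load}
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

if __name__ == "__main__":
    order, context = run_pipeline(TASKS, DEPENDENCIES)
    print(order, context["loaded_total"])
```

Declaring dependencies separately from the task code is the part that carries over to an orchestrator: the tool decides the run order, so tasks can be added or rearranged without rewriting the others.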
About 3 years ago
Yes, I also felt that the pricing could be more open and explicit for customers.
Over 3 years ago
Answered a question: What is your experience regarding pricing and costs for Apache Airflow?
Open-source Apache Airflow is free to use and does not itself incur any cost. However, managed offerings from AWS or GCP have a cost, as do other packaged products such as Astronomer.
Over 3 years ago
Answered a question: What is your primary use case for Apache Spark Streaming?
Near-real-time analytics using near-real-time data ingestion.
Over 3 years ago
Answered a question: Apache Spark without Hadoop -- Is this recommended?
I don't think using Apache Spark without Hadoop has any major drawbacks or issues. I have used Apache Spark quite successfully with AWS S3 on many batch-based projects. That said, for very high-performance systems, HDFS is a better option.
The main problem with Apache…
Over 3 years ago
Answered a question: Which is better - Azure Synapse Analytics or Snowflake?
If you are dealing with semi-structured data like JSON, Snowflake has great support for handling and querying it. It is also good to use as a data lake and can act as a one-stop solution for a data lake and a cloud data warehouse.
query performance and low maintainability is…
Over 3 years ago
Answered a question: What is your experience regarding pricing and costs for Snowflake?
It is a little more costly, but it has great features for keeping costs under control if used properly, such as different warehouse sizes, caching, auto-suspend of warehouses, and more.
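To make the effect of warehouse size and auto-suspend concrete, here is a minimal back-of-the-envelope sketch. It assumes the standard warehouse rates as I understand them from Snowflake's documentation (X-Small at 1 credit per hour, doubling with each size) and per-second billing with a 60-second minimum on each resume; please verify both against the current documentation. The `estimate_daily_credits` helper and the workload numbers are hypothetical, and the model assumes runs are spaced far enough apart that the warehouse suspends between them.

```python
# Assumed credits per hour for standard warehouses; check current Snowflake docs.
CREDITS_PER_HOUR = {
    "XSMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "XLARGE": 16,
}

MINIMUM_BILLED_SECONDS = 60  # assumed minimum charge each time a warehouse resumes


def estimate_daily_credits(size, busy_seconds_per_run, runs_per_day, auto_suspend_seconds):
    """Estimate daily credits for a warehouse that resumes once per run.

    Each run is billed for its busy time plus the idle time that passes
    before auto-suspend stops the warehouse.
    """
    billed_seconds = max(busy_seconds_per_run + auto_suspend_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds * runs_per_day / 3600


if __name__ == "__main__":
    # Hypothetical workload: 24 runs per day, 5 minutes of work each, on MEDIUM.
    for suspend_after in (60, 600):
        credits = estimate_daily_credits("MEDIUM", 300, 24, suspend_after)
        print(f"auto-suspend {suspend_after}s: {credits:.1f} credits/day")
```

Under these assumptions, the same workload costs noticeably more with a long auto-suspend window, because the idle time after every run is billed as well.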
Over 3 years ago
Answered a question: What are the key reasons for choosing Snowflake as a data lake over other data lake solutions?
Handling of semi-structured data like JSON. It has great support for JSON, and we can write SQL on JSON, which is amazing. Performance on semi-structured data is a little poorer than on structured data, but it is still great.
Over 3 years ago
Commented on Good at autoscaling and has a nice time machine feature but they need to add a basic ETL framework
Very good review on Snowflake, very helpful.
Over 3 years ago
Answered a question: What do you like most about Apache Airflow?
Apache Airflow is a great orchestration and automation tool. Its connectivity with other systems is a great plus point, as are the interactive UI, the scheduling options, and its compatibility with Python.
Over 3 years ago
Answered a question: What advice do you have for others considering Cloudera Distribution for Hadoop?
I used CDP on-premises almost 2.5 years ago. I would rate it 8/10. I did not have much to compare it against in those days, as the cloud was not accessible in my organisation. But CDP was definitely a good choice then with respect to open-source distributions. The installation was…
Over 3 years ago
Have you used Purview, the Azure data governance tool? If yes, what's your view, and is it mature enough?
Over 3 years ago
Answered a question: What advice do you have for others considering Snowflake?
Snowflake is an amazing product. It is one of the best cloud data warehouses currently available. The separation of storage and compute and the warehouse concept make it unique, and it has lots of features, low maintenance, and a cost that can be optimised to a great extent if we understand…
Over 3 years ago
Contributed a review of Snowflake: Automatically scales as needed and supports JSON, XML, and Parquet files
Over 3 years ago
Answered a question: What do you like most about Snowflake?
Many features:
1) Separate warehouses and the control the user gets over them
2) Automatic caching
3) JSON and XML handling
4) Minimal DBA activity
Over 3 years ago
Answered a question: What is your primary use case for Snowflake?
We are using it as a data lake and a data warehouse (DWH).
Over 3 years ago
Have you used Azure Purview for data governance, data lineage, and as a data catalog?
Over 3 years ago
Answered a question: When evaluating Infrastructure as a Service (IaaS), what aspect do you think is the most important to look for?
1. TCO: options for long-term commitment vs. pay-as-you-go
2. Ease of setup, security, and performance
3. High availability and support
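The commitment vs. pay-as-you-go trade-off in point 1 comes down to utilisation, and a small sketch can show where the break-even sits. The hourly rates below are made-up numbers, not any provider's pricing, and the helper names are hypothetical. The model assumes a commitment is billed for every hour of the month, used or not.

```python
HOURS_IN_MONTH = 730  # common approximation: 8760 hours / 12 months


def pay_as_you_go_cost(hourly_rate, hours_used):
    """Monthly cost when only the hours actually used are billed."""
    return hourly_rate * hours_used


def commitment_cost(committed_hourly_rate, hours_in_month=HOURS_IN_MONTH):
    """Monthly cost when every hour is billed at a discounted committed rate."""
    return committed_hourly_rate * hours_in_month


def break_even_hours(hourly_rate, committed_hourly_rate, hours_in_month=HOURS_IN_MONTH):
    """Monthly usage above which the commitment becomes the cheaper option."""
    return committed_hourly_rate * hours_in_month / hourly_rate


def cheaper_option(hourly_rate, committed_hourly_rate, hours_used):
    """Return which pricing model costs less for the given monthly usage."""
    on_demand = pay_as_you_go_cost(hourly_rate, hours_used)
    committed = commitment_cost(committed_hourly_rate)
    return "pay-as-you-go" if on_demand < committed else "commitment"


if __name__ == "__main__":
    # Hypothetical rates: 0.10 per hour on demand, 0.06 per hour committed.
    print(round(break_even_hours(0.10, 0.06)))
    print(cheaper_option(0.10, 0.06, hours_used=200))
    print(cheaper_option(0.10, 0.06, hours_used=600))
```

With these illustrative rates, a machine that runs only part of the month is cheaper on pay-as-you-go, while one that runs most of the month favours the commitment.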
Almost 4 years ago
Contributed a review of Apache Spark: Easy to code, fast, open-source, very scalable, and great for big data
Reviews
Almost 4 years ago
Apache Spark
Questions
Answers
Over 3 years ago
Business Process Management (BPM)
Over 3 years ago
Software Configuration Management
Over 3 years ago
Cloud Analytics
Over 3 years ago
Hadoop
Over 3 years ago
Infrastructure as a Service Clouds (IaaS)
Comments
About 3 years ago
Data Warehouse
Over 3 years ago
Data Warehouse
Over 3 years ago
Infrastructure as a Service Clouds (IaaS)
Over 3 years ago
Infrastructure as a Service Clouds (IaaS)
About me
I have 19+ years of experience in building software. I initially worked primarily on Java, Spring, and database systems and then moved to distributed systems over the last 8 years. I have worked on Apache Spark, Snowflake, Hadoop, HBase, Hive, Kafka, etc. I also have exposure to multiple cloud technologies on AWS, Azure, and GCP.