Data Scientist at a wellness & fitness company with 51-200 employees
Data ingestion has reduced manual effort to import data
What is our primary use case?
The primary use case is data ingestion. We currently have HDP 2.6 installed on Ubuntu 16.04.
How has it helped my organization?
It has reduced the manual effort needed to import data.
What is most valuable?
Data ingestion
What needs improvement?
Not enough material is available for beginners.
Buyer's Guide
Talend Data Quality
December 2024
Learn what your peers think about Talend Data Quality. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
831,265 professionals have used our research since 2012.
For how long have I used the solution?
Less than one year.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Principal Developer
It lowers the amount of time in development from weeks to a day
Pros and Cons
- "It lowers the amount of time in development from weeks to a day."
- "If the SQL input controls could dynamically determine the schema based on the SQL alone, it would simplify the steps of having to use a manually created and saved schema for use in the TMap for the Postgres and Redshift components. This would make things even easier."
What is our primary use case?
We use it to load our big data system with S3 and Redshift. We also use it to process incoming HL7 messages from hospitals in real time.
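As a rough illustration of what handling an HL7 feed involves, the sketch below pulls a field out of one HL7 v2 segment in plain Java, the kind of logic that could sit in a tJavaRow. The class name, method names, and sample segment are made up for illustration, and a real feed would normally go through a dedicated HL7 parser rather than string splitting.

```java
import java.util.regex.Pattern;

public class Hl7SegmentSketch {

    /** Returns field n (1-based, as in PID-5) of a non-MSH segment, or "" if absent. */
    static String field(String segment, int n) {
        // Fields are separated by '|'; the -1 limit keeps trailing empty fields.
        String[] fields = segment.split(Pattern.quote("|"), -1);
        // fields[0] is the segment name (e.g. "PID"), so PID-n is fields[n].
        return n < fields.length ? fields[n] : "";
    }

    /** Returns component c (1-based) of a field, split on '^', or "" if absent. */
    static String component(String field, int c) {
        String[] parts = field.split(Pattern.quote("^"), -1);
        return c <= parts.length ? parts[c - 1] : "";
    }

    public static void main(String[] args) {
        // Hypothetical sample segment, not real patient data.
        String pid = "PID|1||12345^^^HOSP^MR||DOE^JANE||19800101|F";
        String name = field(pid, 5); // PID-5 is the patient name
        System.out.println(component(name, 1) + ", " + component(name, 2));
    }
}
```

The MSH segment is deliberately left out because its field numbering is offset by the separator character itself.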
How has it helped my organization?
It lowers the amount of time in development from weeks to a day.
What is most valuable?
The ease of transforming data with inputs to TMaps and tJavaRow makes life so easy.
What needs improvement?
There is one place where I would appreciate an upgrade, if it is possible. If the SQL input controls could dynamically determine the schema based on the SQL alone, it would simplify the steps of having to use a manually created and saved schema for use in the TMap for the Postgres and Redshift components. This would make things even easier. When it does guess the schema, it tends to bring back every column from every table, or every column from the table specified in the table name in the component. Sometimes, the SQL comes from multiple tables and has some transformations of data.
I do not know if it would even be possible, but if this could be figured out automatically for the column names and types, that would be amazing.
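For what it is worth, plain JDBC can describe the columns of an arbitrary query through `ResultSetMetaData`, which is roughly the behaviour wished for here. The sketch below only shows that idea and is not a Talend feature or API; the class name, the `describe` helper, and the stand-in metadata (used so the example runs without a database) are all assumptions made for illustration.

```java
import java.lang.reflect.Proxy;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

public class QuerySchemaSketch {

    /** Maps each column label of a query result to its database type name, in column order. */
    static Map<String, String> describe(ResultSetMetaData meta) throws SQLException {
        Map<String, String> schema = new LinkedHashMap<>();
        // JDBC column indexes are 1-based.
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            schema.put(meta.getColumnLabel(i), meta.getColumnTypeName(i));
        }
        return schema;
    }

    /** Stand-in metadata so the sketch runs without a database (illustration only). */
    static ResultSetMetaData fakeMetaData(String[] labels, String[] types) {
        return (ResultSetMetaData) Proxy.newProxyInstance(
                QuerySchemaSketch.class.getClassLoader(),
                new Class<?>[] {ResultSetMetaData.class},
                (proxy, method, methodArgs) -> {
                    switch (method.getName()) {
                        case "getColumnCount": return labels.length;
                        case "getColumnLabel": return labels[(Integer) methodArgs[0] - 1];
                        case "getColumnTypeName": return types[(Integer) methodArgs[0] - 1];
                        default: throw new UnsupportedOperationException(method.getName());
                    }
                });
    }

    public static void main(String[] args) throws SQLException {
        // With a live connection the metadata would come from something like
        //   connection.prepareStatement(sql).getMetaData()
        // (some drivers may return null there before the statement is executed).
        ResultSetMetaData meta = fakeMetaData(
                new String[] {"patient_id", "visit_count"},
                new String[] {"int4", "int8"});
        System.out.println(describe(meta));
    }
}
```

Because the labels and types come from the query result rather than from a table definition, joins and computed columns are covered as well, provided the driver reports them.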
For how long have I used the solution?
More than five years.
What other advice do I have?
I have not run into anything we could not use Talend to find a solution for.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
VP of Professional Services at a tech services company with 51-200 employees
Enables robust data matching, merging, and Data Stewardship; needs operationalization of metadata
Pros and Cons
- "The solution enables robust data matching, merging, survivorship, and Data Stewardship that can be a part of data quality workflows or true master data management."
- "Needs integrated data governance in terms of dictionaries, glossaries, data lineage, and impact analysis. It also needs operationalization of metadata."
What is our primary use case?
- Fixing data by using regular expressions or synonyms and Data Stewardship.
- Using data profiling to gauge the quality of the data before and after it’s used/needed.
- Master Data Management - Authoring and matching survivorship, including Data Stewardship.
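The first item above, fixing values with regular expressions or synonyms, can be sketched in plain Java as follows. This is only a minimal illustration of the technique, not Talend's implementation; the class name, the helper methods, and the synonym table are hypothetical.

```java
import java.util.Locale;
import java.util.Map;

public class ValueCleanerSketch {

    // Hypothetical synonym table: variant spellings mapped to one canonical value.
    private static final Map<String, String> COUNTRY_SYNONYMS = Map.of(
            "usa", "United States",
            "u.s.", "United States",
            "united states", "United States",
            "uk", "United Kingdom");

    /** Keeps digits only, dropping punctuation and spaces from a phone number. */
    static String normalizePhone(String raw) {
        return raw == null ? "" : raw.replaceAll("[^0-9]", "");
    }

    /** Replaces a known variant with its canonical form; unknown values are only trimmed. */
    static String canonicalCountry(String raw) {
        if (raw == null) {
            return "";
        }
        String trimmed = raw.trim();
        return COUNTRY_SYNONYMS.getOrDefault(trimmed.toLowerCase(Locale.ROOT), trimmed);
    }

    public static void main(String[] args) {
        System.out.println(normalizePhone("(555) 010-2000"));
        System.out.println(canonicalCountry(" USA "));
    }
}
```

Values that match no synonym pass through unchanged, which leaves them visible for a data steward to review instead of silently rewriting them.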
How has it helped my organization?
It allows our customers to master and expand their products to an international scale. In addition, it enables customers to consolidate multiple, disparate sources of data into a centralized master data hub, which can be used for operations or analytics.
What is most valuable?
The solution enables robust data matching, merging, survivorship, and Data Stewardship that can be a part of data quality workflows or true master data management.
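To make "survivorship" concrete: once duplicate records have been matched, a rule decides which value of each attribute survives into the merged record. The sketch below uses one common rule, "most recent non-blank value wins". It is an illustration under that assumption, not the product's algorithm, and every name in it is hypothetical.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SurvivorshipSketch {

    /** One source record: when it was last updated plus its attribute values. */
    static final class SourceRecord {
        final LocalDate updated;
        final Map<String, String> values;

        SourceRecord(LocalDate updated, Map<String, String> values) {
            this.updated = updated;
            this.values = values;
        }
    }

    /** Merges already-matched duplicates: for each attribute, the newest non-blank value survives. */
    static Map<String, String> merge(List<SourceRecord> duplicates) {
        List<SourceRecord> newestFirst = new ArrayList<>(duplicates);
        newestFirst.sort(Comparator.comparing((SourceRecord r) -> r.updated).reversed());

        Map<String, String> golden = new LinkedHashMap<>();
        for (SourceRecord record : newestFirst) {
            for (Map.Entry<String, String> entry : record.values.entrySet()) {
                String value = entry.getValue();
                if (value != null && !value.trim().isEmpty()) {
                    // putIfAbsent keeps the value from the newer record if one was already stored.
                    golden.putIfAbsent(entry.getKey(), value.trim());
                }
            }
        }
        return golden;
    }

    public static void main(String[] args) {
        SourceRecord crm = new SourceRecord(LocalDate.of(2020, 1, 1),
                Map.of("email", "old@example.com", "phone", "555-0100"));
        SourceRecord web = new SourceRecord(LocalDate.of(2023, 6, 1),
                Map.of("email", "new@example.com", "phone", " "));
        System.out.println(merge(List.of(crm, web)));
    }
}
```

In the sample data the newer record supplies the email, while the phone number falls back to the older record because the newer value is blank.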
What needs improvement?
Needs integrated data governance in terms of dictionaries, glossaries, data lineage, and impact analysis. It also needs operationalization of metadata.
For how long have I used the solution?
Three to five years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Junior ETL Developer at a marketing services firm with 51-200 employees
Heap space issues plague us consistently. However, the file fetch process is impeccable.
Pros and Cons
- "The file fetch process is impeccable."
- "We are able to get emails from URLs very easily using this function when others fail."
- "tLogRows are also great for finding bad data."
- "NullPointerExceptions are going to be the death of me and are a big reason for our transition away from Talend. One day, it is fine with 1,000 blank rows, then the next day, it will find one blank cell and it breaks down."
- "Heap space issues plague us consistently. We maxed it out and it runs fine, then it doesn’t, then it does."
- "Finding assistance with issues can be spotty. With Python, there are literally millions of open source answers which are recent and apply to the version that we are using."
What is our primary use case?
We are a marketing and advertising company. We use this tool to fetch data from Google, Bing, and Adobe. We receive marketing data daily via email, FTP, and API, then process the data into MySQL tables.
How has it helped my organization?
Coming into the department with no knowledge of Talend, the interface has been user-friendly enough to allow me to come up to speed in four to five months on almost all its functions and use it like a pro.
What is most valuable?
- The file fetch process is impeccable.
- We are able to get emails from URLs very easily using this function when others fail.
- tLogRows are also great for finding bad data.
What needs improvement?
NullPointerExceptions are going to be the death of me and are a big reason for our transition away from Talend. One day, it is fine with 1,000 blank rows, then the next day, it will find one blank cell and it breaks down. When we are dealing with millions of rows of data, the offending cell can be super hard to find.
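One frequent source of a NullPointerException in hand-written Java expressions is calling a method such as `trim()` on a column that happens to be null. Whether that is the cause in this case is not stated, so the sketch below is only a general defensive pattern; the class and helper names are hypothetical.

```java
public class NullSafeCellSketch {

    /** Treats null, empty, and whitespace-only cells the same way. */
    static boolean isBlank(String cell) {
        return cell == null || cell.trim().isEmpty();
    }

    /** Null-safe trimmed text: never returns null. */
    static String text(String cell) {
        return isBlank(cell) ? "" : cell.trim();
    }

    /** Parses an integer cell, falling back to a default instead of throwing on bad input. */
    static int parseIntOrDefault(String cell, int fallback) {
        if (isBlank(cell)) {
            return fallback;
        }
        try {
            return Integer.parseInt(cell.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        String[] cells = {"42", "  7 ", "", null, "n/a"};
        for (String cell : cells) {
            System.out.println("[" + text(cell) + "] -> " + parseIntOrDefault(cell, -1));
        }
    }
}
```

Routing every cell through helpers like these trades a hard failure for a default value, so it is worth logging or counting the fallbacks to keep bad rows discoverable.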
Heap space issues also plague us consistently. We maxed it out and it runs fine, then it doesn’t, then it does.
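Since Talend jobs run on a JVM, the heap ceiling is set by the standard `-Xmx` option. A quick way to confirm what limit a job actually received is to ask the runtime, as in this minimal sketch; the class and method names are hypothetical and it does not diagnose the reviewer's specific problem.

```java
public class HeapCheckSketch {

    private static final long MIB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in MiB (controlled by -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently in use, in MiB. */
    static long usedHeapMiB() {
        Runtime runtime = Runtime.getRuntime();
        return (runtime.totalMemory() - runtime.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        // Run with, for example: java -Xmx2g HeapCheckSketch
        System.out.println("max heap MiB:  " + maxHeapMiB());
        System.out.println("used heap MiB: " + usedHeapMiB());
    }
}
```

Logging these two numbers at the start and end of a run can help show whether failures track data volume or a setting that was not applied.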
Finding assistance with issues can be spotty. With Python, there are literally millions of open source answers which are recent and apply to the version that we are using.
Inconsistency is a big issue.
For how long have I used the solution?
Three to five years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Technical Team Lead at a pharma/biotech company with 1,001-5,000 employees
Although we faced memory issues with 3GB of RAM, I would recommend this product.
What is most valuable?
JRules, TMap, TParallel, ELT, etc
How has it helped my organization?
It has provided the feature wherein the business could make the changes as requested without performing the ETL deployment code to production.
What needs improvement?
I think the memory issues we faced on computers with 3GB of RAM, compared to those with 4GB, caused a lot of problems. This could probably be improved.
For how long have I used the solution?
4 years - Talend Open Studio 3.1.2, 4.1.3, 5.0, Talend Integration Suite 4.1.3, Talend Data Quality 4
What was my experience with deployment of the solution?
Initially we did encounter issues with the deployment, but over time we were able to find the proper way to perform the deployment, and we also used a tool called HERMES for it.
What do I think about the stability of the solution?
No issues
What do I think about the scalability of the solution?
No issues
How are customer service and technical support?
Customer Service:
Very nice customer service.
Technical Support:
Excellent support from the technical support team.
Which solution did I use previously and why did I switch?
Yes. Earlier we had Ab Initio, but we switched to Talend because initially it was an Open Studio with no cost involved, and also because it supported the JRules component.
How was the initial setup?
It was not straightforward, as the tool was pretty new to everyone on our team, but over time, as we got hands-on experience with it, everything went smoothly.
What about the implementation team?
It was an in-house team.
Which other solutions did I evaluate?
Ab Initio, Informatica etc.
What other advice do I have?
I would definitely recommend others to implement this product as it is really helpful, easy to learn, user friendly, provides a lot of enhanced features, etc.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Associate Team Lead at a tech services company with 51-200 employees
We needed to stop manually finding and cleaning data through Excel spreadsheets.
How has it helped my organization?
Data quality issues are now easy to identify, instead of manually finding and cleaning the data in Excel before ETL, as we used to do.
What is most valuable?
Currently the best open source data quality tool available as compared to other open DQ tools ('DataCleaner', 'Open Source Data Quality & Profiling') for a variety of reasons:
- Vast connectors to different DB, Web, CRM, etc
- Custom code is allowed
- Wide range of advanced algorithms
- Recommended for advanced users
- Detailed analysis, etc
- Large community of users
The most valuable features for us are: custom code, connectors, algorithms.
What do I think about the stability of the solution?
As it is an open source tool, there are some minor bugs.
How was the initial setup?
Fairly straightforward. Lots of user guides and tutorials are available to get started.
What's my experience with pricing, setup cost, and licensing?
The best part is that it is open source.
What other advice do I have?
Great product; definitely give it a try.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Information Architect at a healthcare company
Good and easy debugging functions while better tools for geo-data are needed.
Valuable Features
Maybe the best thing is how easy the product is to get started with when you are familiar with Java. Job creation is also fast compared to some other tools. Another good thing is that table metadata is easy to bring into the tool and utilize. The last thing to mention here is the flexibility to use Java code inside the job.
Improvements to My Organization
These are: fast job creation from start to finish which improves ROI, good and easy debugging functions.
Room for Improvement
First, we faced problems with the stability of the products. Also, some components were clearly not tested well, which meant that there were bugs. Better tools for geo-data are needed. Documentation was poor in the beginning, but it got better over time.
Use of Solution
Talend Enterprise Data Integration 5.1 (1) and Talend Platform for Data Services (2).
Two years at one customer (1, without Data Quality) and six months at another customer (2, with Data Quality).
Deployment Issues
At the customer, deployment from the test environment to production was a bit exhausting. This could be because they didn't know or use the best practices.
Stability Issues
Yes we had issues. Quite often the server needed rebooting as if there were memory leaks. Sometimes the CVS version management got stuck.
Scalability Issues
No issues. The only problems were with Java memory, which can be scaled and changed from the job settings.
Customer Service and Technical Support
Customer Service:
Customer service was good most of the time. Answers came in a timely fashion.
Technical Support:
It was good most of the time. Answers came in a timely fashion.
Initial Setup
It was pretty straightforward. The memory settings on the client needed some modification at first. From the server point of view, I cannot say.
Implementation Team
In house team.
Other Solutions Considered
Yes. We evaluated IBM DataStage.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Technical Consultant
Provides a flexible development environment to the coder
Pros and Cons
- "It has definitely streamlined certain processes."
- "Provides a flexible development environment to the coder."
- "The ability to change the code when debugging the JavaScript could be improved."
What is our primary use case?
Data migration (database to database using direct DB access and commands or using web services).
How has it helped my organization?
It has definitely streamlined certain processes.
What is most valuable?
The ability to build the interface using clear components and to access the code (Java) to validate and trace any error. The wide range of components suits a variety of purposes and provides a flexible development environment to the coder.
What needs improvement?
The ability to change the code when debugging the JavaScript could be improved.
For how long have I used the solution?
One to three years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Popular Comparisons
Informatica Intelligent Data Management Cloud (IDMC)
SAS Data Management
Ataccama ONE Platform
Informatica Cloud Data Quality
Experian Data Quality
Melissa Data Quality
Precisely Trillium