Data Engineering and AI Intern at .3Lines Venture Capital
Good solution for large databases that require a lot of analytics
Pros and Cons
- "BigQuery is a powerful tool for managing and analyzing large datasets. The versatility of BigQuery extends to its compatibility with external data visualization tools like Power BI and Tableau. This means you not only get query results but can also seamlessly integrate and visualize your data for better insights."
- "Some of the queries are complex and difficult to understand."
What is our primary use case?
What is most valuable?
The product's most valuable feature is its ability to connect to visualization tools.
What needs improvement?
Some of the queries are complex and difficult to understand.
For how long have I used the solution?
I have been using the product for more than a year.
Buyer's Guide
BigQuery
November 2024
Learn what your peers think about BigQuery. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
814,649 professionals have used our research since 2012.
What do I think about the scalability of the solution?
My company has 100 users for BigQuery.
How are customer service and support?
The tool's support is fast to respond.
How would you rate customer service and support?
Positive
How was the initial setup?
The tool's deployment is easy if you follow Google's documentation.
What other advice do I have?
If you have a big database and lots of analytics, BigQuery is a really good tool. It helps save and manage your queries and gives you results you can show clients and others. I rate it a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Engineer at a wellness & fitness company with 51-200 employees
Efficient data warehouse solution for analytics and large-scale data processing with exceptional speed and user-friendly interface
Pros and Cons
- "The interface is what I find particularly valuable."
- "It would be beneficial to integrate additional tools, particularly from a business intelligence perspective."
What is our primary use case?
In our workflow, we initiate the process by fetching data, followed by a preprocessing step to refine the data. We establish pipelines for seamless data flow. The ultimate objective is to transfer this processed data into BigQuery tables, enabling other teams, such as analytics or machine learning, to easily interpret and utilize the information for various purposes, whether it's gaining insights or developing models.
How has it helped my organization?
The primary advantages include its speed, especially when dealing with large datasets or big data. It proves exceptionally useful in handling substantial amounts of data efficiently. A notable benefit is the ability to preview data without executing full queries, saving time and allowing for quick insights. This feature eliminates the need to run extensive queries solely for data preview purposes, streamlining the overall workflow.
What is most valuable?
The interface is what I find particularly valuable. When crafting queries, it offers estimations on data usage, providing a helpful indication of resource consumption. This predictive capability adds an extra layer of convenience, making the querying process more insightful and efficient.
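The estimate described here is a byte count, so it can be turned into a readable size before a query is run. A minimal sketch, assuming the estimate is already available as an integer number of bytes; the function name `format_bytes` is hypothetical and not part of BigQuery:

```python
def format_bytes(num_bytes: int) -> str:
    """Render a byte count using binary units (1 KiB = 1024 bytes)."""
    if num_bytes < 0:
        raise ValueError("byte count cannot be negative")
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if size < 1024 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} {units[-1]}"  # not reached; keeps the return type explicit
```

For example, `format_bytes(1536)` yields `1.50 KiB`, which is easier to compare against a budget than a raw byte count.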
What needs improvement?
It would be beneficial to integrate additional tools, particularly from a business intelligence perspective. For instance, incorporating machine learning capabilities could enable users to automatically generate SQL queries.
For how long have I used the solution?
I have been working with it for over a year now.
What do I think about the stability of the solution?
I find it to be generally high and satisfactory. However, there is a notable issue we've encountered regarding query limitations at the organization level.
What do I think about the scalability of the solution?
It is scalable up to a certain point. There seems to be a restriction on the number of queries one can run, for example, being limited to processing ten terabytes of queries. Exceeding this limit results in an inability to run additional queries, posing a potential challenge. Resolving this limitation could contribute to a smoother user experience. Currently, the user base exceeds two hundred individuals.
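A limit of this kind can be anticipated on the client side by adding up the estimated bytes of each query before it is submitted. The sketch below is hypothetical: the `ScanBudget` class, its ten-terabyte default, and the decimal definition of a terabyte are illustrative assumptions, not BigQuery's own quota mechanism.

```python
TB = 10**12  # assumption: decimal terabyte


class ScanBudget:
    """Track estimated bytes scanned against a fixed limit (hypothetical helper)."""

    def __init__(self, limit_bytes: int = 10 * TB) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def remaining(self) -> int:
        """Bytes still available under the limit."""
        return self.limit_bytes - self.used_bytes

    def try_reserve(self, estimated_bytes: int) -> bool:
        """Record the query and return True if it fits; otherwise return False."""
        if estimated_bytes < 0:
            raise ValueError("estimate cannot be negative")
        if estimated_bytes > self.remaining():
            return False  # running this query would exceed the limit
        self.used_bytes += estimated_bytes
        return True


budget = ScanBudget()
first_ok = budget.try_reserve(4 * TB)   # fits within the 10 TB limit
second_ok = budget.try_reserve(7 * TB)  # rejected: only 6 TB remain
```

Checking the estimate first means a team finds out that a query will not fit before it is blocked, rather than after.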
Which solution did I use previously and why did I switch?
We used Google Cloud Storage, IAM, AWS (specifically VPC), and instances from both AWS and Google Cloud Platform. Regarding comparison with other solutions, particularly AWS, there are notable observations. AWS, being introduced earlier, appears to have more extensive features compared to Google Cloud Platform (GCP). AWS enjoys the advantage of having a more established history, resulting in robust support from their team. It offers a more comprehensive platform with a broader range of features, and its pricing structure appears to be more favorable.
How was the initial setup?
The challenging part lies in the initial setup of the project, especially when integrating with project management tools. When establishing a project on the Google Cloud Platform, you need to navigate through various resources.
What about the implementation team?
Setting up the account, whether at an individual or organizational level, involves providing necessary information, including credit card details for billing purposes. Once the account is set up, accessing resources like Cloud Storage or BigQuery becomes straightforward within the Google Cloud Platform.
What other advice do I have?
For those venturing into cloud platforms, especially at an individual level, I would recommend considering AWS. Given its longer establishment in the industry, many companies utilize AWS. Additionally, both AWS and GCP offer free tiers for new users, but AWS extends this benefit to one year, while GCP limits it to three months. At the organizational level, AWS tends to provide more extensive features compared to GCP, making it a preferable choice. Overall, I would rate it eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sr Manager at a transportation company with 10,001+ employees
Everything they advertised worked exactly as promised
Pros and Cons
- "We basically used it to store server data and generate reports for enterprise architects. It was a valuable tool for our enterprise design architect."
- "I would like to see version-based implementation and a fallback arrangement for data stored in BigQuery storage. These are some features I'm interested in."
What is our primary use case?
We basically used it to store server data and generate reports for enterprise architects. It was a valuable tool for our enterprise design architect.
What is most valuable?
Everything they advertised or listed worked exactly as promised. That was advantageous to us.
What needs improvement?
In future releases, I would like to see more pre-defined aggregated forms. After using BigQuery, we need to use the data in an enterprise architecture dimensional data model. So, having pre-defined aggregated forms would be helpful.
Additionally, I would like to see version-based implementation and a fallback arrangement for data stored in BigQuery storage. These are some features I'm interested in.
For how long have I used the solution?
I have experience with BigQuery.
What about the implementation team?
When I joined the company, BigQuery was already implemented by our team.
What's my experience with pricing, setup cost, and licensing?
It is a cheap solution.
What other advice do I have?
I would recommend getting a clear understanding of BigQuery's functionalities and what it's best suited for. If your needs align with its capabilities, then you should definitely proceed.
BigQuery offers fantastic features, but it's important to understand its purpose beforehand. Otherwise, you might face difficulties later on.
Overall, I would rate the solution an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.
IT Consultant at 18months
A serverless, scalable and cost-efficient data warehouse solution with seamless integration, real-time analytics, and advanced machine-learning capabilities
Pros and Cons
- "It stands out in efficiently handling internal actions without the need for manual intervention in tasks like building cubes and defining final dimensions."
- "The primary hurdle in this migration lies in the initial phase of moving substantial volumes of data to cloud-based platforms."
What is our primary use case?
We have a cloud solution that runs in a centralized mode for a few hundred senior managers who require diverse reports, ranging from daily operational details to more substantial analyses, such as sales trends, movie ticket sales clustering, and reporting.
What is most valuable?
The flexibility of its serverless architecture is advantageous in handling the variable nature of our workloads. Instead of relying on a fixed database cluster with constant costs, it allows you to pay for the resources you consume during peak times. This on-demand pricing model appears to be more cost-effective, particularly when dealing with occasional heavy queries that involve analyzing billions of data points, such as ticket sales for millions of movies. The ability to scale internally using Kubernetes adds another layer of flexibility to our setup, allowing us to adapt to varying demands efficiently. Its fast response times during peak usage make it a suitable choice for our dynamic and variable data processing needs. I appreciate its impressive optimization and automation features, observed during small-scale tests. It stands out in efficiently handling internal actions without the need for manual intervention in tasks like building cubes and defining final dimensions.
What needs improvement?
The primary hurdle in this migration lies in the initial phase of moving substantial volumes of data to cloud-based platforms. This becomes even more pronounced when dealing with terabytes of data. Uploading data to cloud services requires careful consideration and optimization to ensure a smooth and efficient migration, especially when dealing with large datasets.
For how long have I used the solution?
I started using it recently.
What do I think about the scalability of the solution?
It inherently manages scalability with its auto-scaling capabilities. The ability to dynamically adjust resources based on demand is a key factor in optimizing performance and ensuring that our system can handle varying workloads efficiently. We operate as a small company with a modest business scale, handling a few medium-sized projects each year.
How was the initial setup?
The current bottleneck in our migration process primarily revolves around bandwidth issues, especially during the initial data ingestion phase.
What about the implementation team?
The deployment process itself is straightforward and not a source of concern. The real challenge lies in the bandwidth limitations and the time-consuming nature of data uploading. While a comprehensive evaluation is still pending, it's anticipated that the data upload alone might take up to a week or more.
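The upload time can be roughly estimated from the data volume and the available bandwidth. The following is a back-of-the-envelope sketch; the function name `upload_days`, the `efficiency` factor, and the example figures (5 TB over a 100 Mbit/s link) are assumptions for illustration and do not come from the review.

```python
def upload_days(data_bytes: float,
                bandwidth_bits_per_second: float,
                efficiency: float = 1.0) -> float:
    """Estimate transfer time in days for a given data size and link speed.

    efficiency is the fraction of nominal bandwidth actually achieved (0 < e <= 1).
    """
    if bandwidth_bits_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = data_bytes * 8 / (bandwidth_bits_per_second * efficiency)
    return seconds / 86_400  # seconds per day


# Hypothetical example: 5 TB (decimal) over a 100 Mbit/s uplink.
ideal = upload_days(5e12, 100e6)           # about 4.6 days at full line rate
realistic = upload_days(5e12, 100e6, 0.7)  # about 6.6 days at 70% efficiency
```

Under these assumed figures, a few terabytes already take the better part of a week, which is consistent with the estimate given in the review.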
What's my experience with pricing, setup cost, and licensing?
The pricing appears to be competitive for the intended usage scenarios we have in mind.
Which other solutions did I evaluate?
In my evaluation of alternative solutions, I'm exploring Hydra, a columnar version of Postgres with partitioning capabilities. While I'm still learning about its features and performance, it seems promising. Additionally, I'm considering ClickHouse, which has shown exceptional benchmark results. I've completed an initial installation to assess its functionality.
What other advice do I have?
Overall, I would rate it eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Engineer at a financial services firm with 10,001+ employees
A fully-managed, serverless data warehouse with good storage and unlimited table length
Pros and Cons
- "The main thing I like about BigQuery is storage. We did an on-premise BigQuery migration with trillions of records. Usually, we have to deal with insufficient storage on-premises, but in BigQuery, we don't get that because it's like cloud storage, and we can have any number of records. That is one advantage. The next major advantage is the column length. We have some limits on column length on-premises, like 10,000, and we have to design it based on that. However, with BigQuery, we don't need to design the column length at all. It will expand or shrink based on the records it's getting. I can give you a real-life example based on our migration from on-premises to GCP. There was a dimension table with a general number of records, and when we queried that on-premises, like in Apache Spark or Teradata, it took around half an hour to get those records. In BigQuery, it was instant. As it's very fast, you can get it in two or three minutes. That was very helpful for our engineers. Usually, we have to run a query on-premises and go for a break while waiting for that query to give us the results. It's not the case with BigQuery because it instantly provides results when we run it. So, that makes the work fast, it helps a lot, and it helps save a lot of time. It also has a reasonable performance rate and smart tuning. Suppose we need to perform some joins, BigQuery has a smart tuning option, and it'll tune itself and tell us the best way a query can be done in the backend. To be frank, the performance, reliability, and everything else have improved, even the downtime. Usually, on-premise servers have some downtime, but as BigQuery is multiregional, we have storage in three different locations. So, downtime is also not getting impacted. For example, if the Atlantic ocean location has some downtime, or the server is down, we can use data that is stored in Africa or somewhere else. We have three or four storage locations, and that's the main advantage."
- "It would be better if BigQuery didn't have huge restrictions. For example, when we migrate from one on-premises system to another, the data which handles all ebook characters can be handled on-premises. But in BigQuery, we have huge restrictions. If we have some symbols, like a hash or other special characters, it won't accept them. Not in all cases, but it won't accept a few special characters, and when we migrate, we get errors. We need to use Regexp or something similar to replace that with another character. This isn't expected from a high-range technology like BigQuery. It has to adapt to all products. For instance, if we have a TV Showroom, the TV symbol will be there in the shop name. Teradata and Apache Spark accept this, but BigQuery won't. This is the primary concern that we had. In the next release, it would be better if the query on the external table also had cache. Right now, we are using a GCS bucket, and in the native table, we have cache. For example, if we query the same table, it won't cost because it will try to fetch the records from the cached result. But when we run queries on the external table a number of times, it won't be cached. That's a major drawback of BigQuery. Only the native table has the cache option, and the external table doesn't. If there is an option to have an external table for cache purposes, it'll be a significant advantage for our organization."
What is our primary use case?
We use BigQuery to store data in a table and query it. Data storage can be either an internal native table or an external table where the external source will point to Google Cloud Storage or Google Drive.
Wherever we can have external storage, we can have a table built pointing to that external storage and query the tables. In BigQuery, we can query the table or even do DML operations, like insert, delete, etc.
What is most valuable?
The main thing I like about BigQuery is storage. We did an on-premise BigQuery migration with trillions of records. Usually, we have to deal with insufficient storage on-premises, but in BigQuery, we don't get that because it's like cloud storage, and we can have any number of records. That is one advantage.
The next major advantage is the column length. We have some limits on column length on-premises, like 10,000, and we have to design it based on that. However, with BigQuery, we don't need to design the column length at all. It will expand or shrink based on the records it's getting.
I can give you a real-life example based on our migration from on-premises to GCP. There was a dimension table with a general number of records, and when we queried that on-premises, like in Apache Spark or Teradata, it took around half an hour to get those records. In BigQuery, it was instant. As it's very fast, you can get it in two or three minutes. That was very helpful for our engineers.
Usually, we have to run a query on-premises and go for a break while waiting for that query to give us the results. It's not the case with BigQuery because it instantly provides results when we run it. So, that makes the work fast, it helps a lot, and it helps save a lot of time.
It also has a reasonable performance rate and smart tuning. Suppose we need to perform some joins, BigQuery has a smart tuning option, and it'll tune itself and tell us the best way a query can be done in the backend.
To be frank, the performance, reliability, and everything else have improved, even the downtime. Usually, on-premise servers have some downtime, but as BigQuery is multiregional, we have storage in three different locations. So, downtime is also not getting impacted.
For example, if the Atlantic ocean location has some downtime, or the server is down, we can use data that is stored in Africa or somewhere else. We have three or four storage locations, and that's the main advantage.
What needs improvement?
It would be better if BigQuery didn't have huge restrictions. For example, when we migrate from one on-premises system to another, the data which handles all ebook characters can be handled on-premises. But in BigQuery, we have huge restrictions. If we have some symbols, like a hash or other special characters, it won't accept them. Not in all cases, but it won't accept a few special characters, and when we migrate, we get errors.
We need to use Regexp or something similar to replace that with another character. This isn't expected from a high-range technology like BigQuery. It has to adapt to all products. For instance, if we have a TV Showroom, the TV symbol will be there in the shop name. Teradata and Apache Spark accept this, but BigQuery won't. This is the primary concern that we had.
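The Regexp workaround mentioned here can be sketched as a preprocessing step that runs before data is loaded. Which characters actually cause errors is not stated in the review, so the allowed set below (ASCII letters, digits, whitespace, and a few punctuation marks) and the function name `replace_unsupported` are assumptions for illustration only.

```python
import re

# Assumption: any character outside this set is treated as "unsupported".
# The real set of rejected characters depends on the load path and encoding.
_UNSUPPORTED = re.compile(r"[^A-Za-z0-9\s.,&'-]")


def replace_unsupported(text: str, replacement: str = "_") -> str:
    """Replace every character outside the allowed set with a placeholder."""
    return _UNSUPPORTED.sub(replacement, text)


cleaned = replace_unsupported("TV Showroom #1")
```

Replacing rather than deleting the character keeps field lengths stable and makes the substitution visible in the loaded data.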
In the next release, it would be better if the query on the external table also had cache. Right now, we are using a GCS bucket, and in the native table, we have cache. For example, if we query the same table, it won't cost because it will try to fetch the records from the cached result. But when we run queries on the external table a number of times, it won't be cached. That's a major drawback of BigQuery. Only the native table has the cache option, and the external table doesn't. If there is an option to have an external table for cache purposes, it'll be a significant advantage for our organization.
For how long have I used the solution?
I have been using BigQuery for more than three years.
What do I think about the stability of the solution?
BigQuery is a stable solution.
What do I think about the scalability of the solution?
BigQuery is highly scalable. We can have unlimited storage, and it's very fast whether we have 20 records or scale to 20 trillion.
In my organization, about two in five use BigQuery. When I joined the company a year back, usage was relatively moderate. However, usage has now increased because of the on-premises to GCP migration. Because of many successful projects, several people are using BigQuery now.
How are customer service and support?
We have dedicated support people who help us with the framework. If there is a technical issue in BigQuery, we just get help from the technical team. But if there are any engineering issues or some data issues, our team will handle them.
Which solution did I use previously and why did I switch?
I used Teradata and then Apache Spark on-premises.
How was the initial setup?
The initial setup is relatively straightforward. There are some restrictions, like the project's name. It has to be unique, but once that project is created, we can simply go to an option, query, and the query control will open, and we can start creating a table, loading data, querying, and everything. So that's quite simple and straightforward.
What about the implementation team?
When I joined PayPal, the setup was done in-house. When I worked at another organization, Cognizant, we had Google's help. So a Google specialist helped us set up and everything.
What's my experience with pricing, setup cost, and licensing?
I have tried my own setup using my Gmail ID, and I think it had a $300 limit for free for a new user. That's what Google is offering, and we can register and create a project.
What other advice do I have?
On a scale from one to ten, I would give BigQuery an eight.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Senior Cyber Security Architect Global ICT at a construction company with 10,001+ employees
A stable solution with out-of-the-box capabilities that can be used for analytics and reporting
Pros and Cons
- "The solution's reporting, dashboard, and out-of-the-box capabilities match exactly our requirements."
- "As a product, BigQuery still requires a lot of maturity to accommodate other use cases and to be widely acceptable across other organizations."
What is our primary use case?
We use BigQuery for analytics and reporting.
What is most valuable?
The most valuable feature of BigQuery is its capability to integrate. The product fits pretty well within our ecosystem. The solution's reporting, dashboard, and out-of-the-box capabilities match exactly our requirements.
What needs improvement?
As a product, BigQuery still requires a lot of maturity to accommodate other use cases and to be widely acceptable across other organizations. It's not as old as other applications like Tableau or Power BI, but as long as it's supported by Google, I think it will continue to progress.
For how long have I used the solution?
I have been working with BigQuery for about two years.
What do I think about the stability of the solution?
BigQuery's stability is good. I rate BigQuery a nine out of ten for stability.
What do I think about the scalability of the solution?
We have tested and found that BigQuery's scalability is good. I rate BigQuery a seven to eight out of ten for scalability.
How was the initial setup?
BigQuery's initial setup was simple because it's provided over the cloud.
What other advice do I have?
BigQuery is suitable for all sorts of business types. Medium and small businesses will find the solution's out-of-the-box use cases more useful.
Overall, I rate BigQuery an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Team Lead Data & Analytics at a hospitality company with 501-1,000 employees
Good performance, not too expensive, and user-friendly
Pros and Cons
- "It has a well-structured suite of complementary tools for data integration and so forth."
- "When it comes to queries or the code being executed in the data warehouse, the management of this code, like integration with the GitHub repository or the GitLab repository, is kind of complicated, and it's not so direct."
What is our primary use case?
This is a cloud-based data warehouse.
What is most valuable?
The product is updated automatically without people having to worry about doing anything. It is managed completely by Google.
The performance is good. It's very user-friendly for people not coming from the technical area.
It has a very friendly user interface and a command-line console.
It has a well-structured suite of complementary tools for data integration and so forth.
What needs improvement?
When it comes to queries or the code being executed in the data warehouse, the management of this code, like integration with the GitHub repository or the GitLab repository, is kind of complicated, and it's not so direct. When people are working on long queries, and so forth, they have to save them. It is a little bit clunky. The interface for saving them and version control is not really doable. We have to support the queries manually.
For how long have I used the solution?
I've used the solution across different companies. I've used it for about six or seven years.
What's my experience with pricing, setup cost, and licensing?
In my previous company, we were not spending that much. You give more money away to the other tools from GCP. We paid maybe €200 or something like that and no more than that. This year, we pay €170 a month.
What other advice do I have?
We are an end-user.
The product is a software as a service, and therefore, we are always on the latest version. They do everything for us.
I'd rate the product eight out of ten as it's a very good data warehouse, and it's very easy to learn how to use it. It's very user-friendly. I can have my team handle it even if they are non-technical, and they can do a lot of coding there without problems.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Senior Principal Architect at a real estate/law firm with 5,001-10,000 employees
A NoSQL framework where you can scale queries to petabytes of data
Pros and Cons
- "The query tool is scalable and allows for petabytes of data."
- "The solution hinges on Google patterns so continued improvement is important."
What is our primary use case?
Our company uses the solution as a data warehouse for implementing machine learning use cases and queries.
What is most valuable?
The query tool is scalable and allows for petabytes of data.
The NoSQL model and feeds for machine learning are based on the support of competent technologies.
The solution includes plenty of additional features.
What needs improvement?
The solution hinges on Google patterns so continued improvement is important.
For how long have I used the solution?
I have been using the solution for two years.
What do I think about the stability of the solution?
The solution is stable.
What do I think about the scalability of the solution?
The solution is scalable and we have 200 users with no issues.
How are customer service and support?
Google has one technical support channel for all products and services. If you place a support ticket, they will respond to you in order of priority.
How was the initial setup?
There is no setup because the solution resides in the cloud. Once you enable the APIs in the Google Cloud ecosystem, you can start consuming right away.
What's my experience with pricing, setup cost, and licensing?
The price is a bit high but the technology is worth it. If you do not use the solution in the right way, it will be expensive.
Which other solutions did I evaluate?
There is not an equivalent competitor product because the solution is Google's proprietary technology.
What other advice do I have?
If you are interested in a NoSQL option, definitely try the solution.
I rate the solution a ten out of ten.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.