Let's learn about Data Analytics via these 402 free blog posts. They are ordered by HackerNoon reader engagement data. Visit /Learn or LearnRepo.com to find the most read blog posts about any technology.
No matter the project, data analytics are a must.
1. Import JSON To Google Sheets - 3 Best Ways To Do It
3 ways to pull JSON data into a Google Spreadsheet
2. 10 Best Datasets for Time Series Analysis
In order to understand how a certain metric varies over time and to predict future values, we will look at the 10 Best Datasets for Time Series Analysis.
3. How To Import External Data Into Google Sheets Without Copy/Paste
Learn how to save time and eliminate manual data imports in Google Sheets by automatically connecting and importing data from external sources.
4. Effective Workarounds for SQL-Style Joins in Elasticsearch
In this blog, we explore how nested objects and parent-child relationships enable SQL-like join operations in Elasticsearch.
5. How to Build a Data-Driven Product Using Metabase
Metabase is a business intelligence tool that lets you access your data in a read-only manner.
6. Advantages and Disadvantages of Big Data
Big data may seem like any other buzzword in business, but it’s important to understand how big data benefits a company and how it’s limited.
7. LanceDB: Your Trusted Steed in the Duel Against Data Complexity
There are many ways to build on the foundation offered in this tutorial to create performant, scalable and future-proofed ML/AI architectures.
8. DynamoDB Filtering and Aggregation Queries Using SQL on Rockset
How to use DynamoDB and Rockset together to build a fast, delightful application experience for users.
9. AWS Redshift vs Snowflake: A Comprehensive Guide to Embedded Analytics Solutions
Discover the importance of embedded analytics within SaaS applications and the critical role of data warehousing solutions like AWS Redshift and Snowflake.
10. How Custom Data Models Drive Next-Generation Embedded Analytics
Learn how custom data models drive impactful embedded analytics within SaaS applications and deliver custom experiences for users and providers alike.
11. Scalable and Secure Data-Driven Application With Multi-Tenant Databases and Embedded Analytics
Multi-tenant databases and embedded analytics intersect to securely scale applications and provide real-time analytics.
12. Crypto Growth: Creating Effective User Personas
Discover the importance of creating effective user personas for crypto companies. Learn how to profile your audience, address their needs, and enhance UX
13. 12 Best Pre-Installed R Datasets Commonly Used for Statistical Analysis
R programming is mostly used in statistical analysis and ML.
This article looks at the Best Pre-Installed R Datasets Commonly Used for Statistical Analysis.
14. Synthetic Data And Its Potential In Healthcare
Synthetic data represents a paradigm shift in healthcare because it allows data to transcend its potential shortcomings.
15. Building a Data Analytics Platform to Streamline the Temporary Labor Sector
How Maruti Techlabs developed a cutting-edge data analytics platform that optimized workforce management, reduced costs, and maximized productivity.
16. Scraping Google Search Results With Node JS
In this post, we will learn web scraping Google with Node JS using some of the in-demand web scraping and web parsing libraries present in Node JS.
17. Eliminating Difference Between Business Intelligence analysts, Data Analysts or Data Scientists 🚀
There was a time when the data analyst on the team was the person driving digitalization in an adventurous data quest...and then the engineers took over.
18. Simulating Infectious Disease Spread with Python: SIR and SEIR Models
Explore disease modeling using Python with the SIR and SEIR models. Learn how to master Python for infectious disease analysis, integrate real data, and assess.
19. Scaling PostgreSQL: How We Tamed 10 Billion Daily Records and 350 TB+ of Data
Read how we used Timescale to scale a 350 TB+ PostgreSQL database to build Insights, our new database observability tool.
20. How GPUs are Beginning to Displace Clusters for Big Data & Data Science
More recently on my data science journey, I have been using a low-grade consumer GPU (NVIDIA GeForce 1060) to accomplish things that were previously only realistically possible on a cluster. Here is why I think this is the direction data science will go in the next 5 years.
21. Size Does Matter: Global Control Group for a Bank
Learn how to approach data-driven measurement properly. See what unexpected results we got in a bank and get insights for your own data analytics journey.
22. Your Definitive Guide to Lakehouse Architecture with Iceberg and MinIO
This post focuses on how Iceberg and MinIO complement each other and how various analytic frameworks (Spark, Flink, Trino, Dremio, Snowflake) can leverage them.
23. From Materialized Views to Continuous Aggregates: Enhancing PostgreSQL With Real-Time Analytics
Discover how PostgreSQL's materialized views have evolved into dynamic, real-time analytical tools called continuous aggregates.
24. Combining Delta Lake With MinIO for Multi-Cloud Data Lakes
The combination of MinIO and Delta Lake enables enterprises to have a multi-cloud data lake that serves as a consolidated single source of truth.
25. How to Track User Navigation Events in a React Application
A scalable and maintainable strategy for tracking page navigation events in a React application.
26. The New Private Cloud From the Eyes of an Architect
The term “private cloud” used to have a negative connotation, but is now viewed a lot more positively.
27. How Drones Are Transforming Big Data Analytics
The world is transforming right before our eyes. We’ve heard about drones for a long time now, especially with big companies like Amazon using them for more efficient package delivery, a major trend in modern e-commerce. Instead of your local delivery man, a drone may drop a package right on your doorstep. The true power of drones goes well beyond that, though. They provide businesses with data that’s difficult to collect otherwise. In addition to taking aerial photos and videos, drones can collect information about everything from the health of crops to thermal leaks in buildings.
28. How To Deploy Metabase on Google Cloud Platform (GCP)?
Metabase is a business intelligence tool for your organisation that plugs in various data-sources so you can explore data and build dashboards. I'll aim to provide a series of articles on provisioning and building this out for your organisation. This article is about getting up and running quickly.
29. Unlocking the Power of Data Lakes for Embedded Analytics in Multi-Tenant SaaS
Discover why data lakes are superior to traditional data warehouses for embedded analytics in SaaS applications.
30. Creating a Data Lakehouse using Apache Iceberg and MinIO
The promise of Data Lakehouses is in their capabilities for structured and unstructured data, all in a centralized solution using Apache Iceberg and MinIO.
31. How to Create a Simple Web Dashboard for Efficient Data Analytics
A dashboard with different visualizations allows you to compare data and show changes and trends. In this tutorial I will explain why and how to build one.
32. Postgres TOAST: Understanding the Data Compression Mechanism and Its Limitations
Discover the challenges of PostgreSQL's traditional TOAST mechanism for data compression and storage optimization.
33. Finding Digital Crimes by Exploring Master File Table (MFT) Records
To explore the MFT records, learn how to locate date and time values in the metadata of a file we create.
34. Installing and Configuring Kubeflow with MinIO Operator
Kubeflow is a modern solution to design, build and orchestrate Machine Learning pipelines using the latest and most popular frameworks.
35. When A/B Tests Aren’t Possible, Causal Inference Can Still Measure Marketing Impact
Learn how to measure marketing impact without A/B tests using causal inference, Diff-in-Diff, synthetic control, and GeoLift.
36. 8 Best Human Behaviour Datasets for Machine Learning
Human behaviour describes how people interact. In this article, we will look at the 8 Best Human Behaviour Datasets for Machine Learning.
37. 5 Main Uses of Generative AI in Business Intelligence & Data Analytics
In this article, we’ll explore 5 main use cases of generative AI in business intelligence and data analytics and how real companies are making use of it.
38. My Favorite Free Excel Courses for Programmers, Data Analysts, and IT Professionals
If you want to learn Microsoft Excel, a productivity tool for IT professionals, and are looking for free online courses, then you have come to the right place.
39. AB Testing on Small Sample Sizes with Non-Normal Distributions
In this article, we will explore the intricacies of AB testing on small sample sizes, which can be valuable in B2B settings or products with a limited user base
40. How to Scrape Data from Google Maps
Want to scrape data from Google Maps? This tutorial shows you how to do it.
41. Using Elasticsearch to Offload Search and Analytics from DynamoDB: Pros and Cons
While Elasticsearch is known for being flexible and highly customizable, it is a complex distributed system that requires cluster and index operations.
42. What the Heck Is Malloy?
Malloy is a new experimental language for describing data relationships and transformations created by the developer of Looker.
43. Probabilistic Predictions in Classification - Evaluating Quality
Binary classification is one of the most common machine learning tasks. In practice, the goal of such tasks often extends beyond simply predicting a class.
44. 4 Elasticsearch Performance Challenges and How to Solve Them
Solutions to common Elasticsearch performance challenges at scale including slow indexing, search speed, shard and index sizing, and multi-tenancy.
45. Elasticsearch Updates, Inserts, Deletes: Understanding How They Work and Their Limitations
For a system like Elasticsearch, engineers need to have in-depth knowledge of the underlying architecture in order to efficiently ingest streaming data.
46. 5 Best Practices for Tracking In-app Event Data
This is the era of mobile apps. We get everything - from critical business information to entertaining videos and games - on our mobile devices. Information is right at our fingertips, and we are always striving to catch up with the outside world. As per App Annie, an average smartphone user has 80 apps installed.
47. How to Use Propensity Score Matching to Measure Down Stream Causal Impact of an Event
How can we know our ads are making the impact we aim for? What if targeted ads are not working the way we want them to?
48. Polygon data: What it is and how can it be used?
This blog explains polygon data, its benefits, and how organizations widely use it in geomarketing, indoor mapping, and mobility analysis.
49. Meet New & Improved BigQuery: Single, Unified AI-Ready Data Platform
Google has gone a step further and unified key Google Cloud data analytics capabilities under BigQuery - now the single, AI-ready data analytics platform.
50. Commercial Analytics
I'll share insights into how we can uncover untapped potential in pricing, assortment management, and stock logistics with data-based instruments and processes.
51. Top 8 Best Qlik Sense Extensions
Qlik Sense is powerful data visualization and BI software. But sometimes its functions are not enough. Meet the best Qlik Sense extensions to do more with data!
52. Secrets to Growth Marketing Data Engineering – Even in This Down Economy
Marketing is a big business and it's only going to grow bigger. One reason for this is that marketers need to keep growing the list of data points.
53. How to Think Like a Data Scientist or Data Analyst
Data science is a new and maturing field, with a variety of job functions emerging, from data engineering and data analysis to machine and deep learning. A data scientist must combine scientific, creative and investigative thinking to extract meaning from a range of datasets, and to address the underlying challenge faced by the client.
54. Comparing Meilisearch and Manticore Search Using Key Benchmarks
Both Manticore and Meilisearch position themselves as full-text search engines. The key element in full-text search engines is how they rank documents.
55. Pandas vs Polars in 2025: Choosing the Best Python Tool for Big Data
Comparing Pandas and Polars
56. Building A Log Analytics Solution 10 Times More Cost-Effective Than Elasticsearch
There exist two common log processing solutions within the industry, exemplified by Elasticsearch and Grafana Loki, respectively.
57. Building Engaging Real-Time Data Visualizations In React With Highcharts
Learn to create real-time charts in React with Highcharts and WebSocket. Includes setup guide, code examples, and GitHub repo for quick experimentation.
58. Data-Driven Validation for Business Ideas: A Step-by-Step Guide
Unlock the potential of data-driven validation for your side project. Discover how utilizing data insights drives informed decision-making and save some grief!
59. Mean Reversion Trading Systems and Cryptocurrency Trading [A Deep Dive]
Prices move in a wave-like fashion, moving back and forth following a broader trend. While doing so, they often revolve around a mean. They might move across or bounce off the mean. Mean reversion systems are designed to exploit this tendency.
60. The Top Big Data Consulting Firms
Thanks to big data, today an organization can quickly obtain the necessary information from an unordered data set and deploy it effectively. The growing popularity of big data analytics has led to a significant increase in the number of companies providing big data solutions and related services.
61. 8-Ways Data Mining Can Improve your Business
If your company is trying to make sense of its customer data, here’s a not-so-surprising fact for you: you aren’t alone. Far too many companies want to understand data and gain in-depth insight into the information they are sitting on. Let’s be clear that today, the success of a business lies in how efficient its data mining process is and in its expertise in processing the available data, as this can help it decipher age-old questions that make or break it:
62. What the Heck is PRQL?
Another clever tool for a powerful SQL pre-processor
63. Statistics Cheat Sheet: A Beginner's Guide to Probability and Random Events
A beginner’s guide to Probability and Random Events. Understand the key statistics concepts and areas to focus on to ace your next data science interview.
64. Creating an Interactive Word Tree Chart with JavaScript
Learn how to create beautiful interactive JavaScript Word Trees and check out an awesome Word Tree chart visualizing the text of The Little Prince.
65. Secure Multi-Party Computation Use Cases
Secure Multi-Party Computation (SMPC), as described by Wikipedia, is a subfield of cryptography that aims to create methods for multiple parties to jointly compute a function over their inputs while keeping those inputs private. A significant benefit of Secure Multi-Party Computation is that it preserves data privacy while keeping the data usable and open for analysis.
66. How AI Forecasting Helps SMBs Plan Inventory (and Save Costs)
Learn how AI inventory forecasting helps SMBs cut stockouts and excess stock, with a 30-day rollout, KPI tips (MAPE/WAPE), and a workflow you can run in BoxHero
67. Is The Modern Data Warehouse Dead?
Do we need a radical new approach to data warehouse technology? An immutable data warehouse starts with the data consumer SLAs and pipes data in pre-modeled.
68. Customer Data Platform (CDP) Vs Data Warehouse, CRM, and Data Management Platform
In this post, we highlight some key differences between a Customer Data Platform (CDP) and other tools generally used in a marketing tech stack. We also tackle the all-important question on many companies’ minds: “Should I build or buy a CDP?”
69. How to use Redis HyperLogLog
How to use Redis HyperLogLog data structure to store millions of unique items.
70. Unraveling the Maze of Large JSON Files: Tips and Tools for Local JSON Parsing
Discover how a backend developer overcomes obstacles in processing large JSON log files.
71. How to Improve Your Data Literacy Skills
Are you data literate? In today's data-driven world, data literacy is a crucial skill. Here's how you can develop it for yourself.
72. Top 40+ Data Science Product Interview Questions
Find the top 40+ product interview questions you must prepare for your next data science interview.
73. Analysis of Network Graphs: Visualizing Hamilton Characters as a Social Network
Discover how graph theory and data science techniques unlock new insights into character relationships in literature, from Game of Thrones to Hamilton.
74. 9 Big Takeaways From Data Analytics This Year
The world is data-driven, and it has become a necessity to extract meaningful insights from unstructured data.
75. Amazon Is Losing Market Share Across Several Segments to Competitors [A Numbers Game]
You’ve probably read about how Amazon has put a stop to its paid acquisition. We’ve covered the topic extensively already over the past few weeks, and yet, what we’ve recently discovered sheds some light on the magnitude of this move.
76. A Step By Step Guide To Data Visualization With Power BI
Power BI is the collective name for an assortment of cloud-based apps and services that help organizations collate, manage and analyze data from various sources
77. Using Python to Refresh Tableau Dashboard
In most cases, Tableau tasks are scheduled to run at designated times; however, there may be occasions where that flexibility is not sufficient.
78. Create a Custom AI Slack Bot for Streamlined Data Analytics in Natural Language
Organizations are always looking for ways to make their data analysis process more efficient. Here's an open-source Slack bot that does just that.
79. Achieving Mastery: How Analytics Drives Game Balancing and Tuning
Explore how the power of analytics propels game balancing and tuning, unveiling the secrets to achieving mastery in game development.
80. The Science Behind the Fun: Why Game Analytics Matter in Modern Game Development
Uncover the science behind the fun! Explore why game analytics matter in modern development, driving data-driven decisions and optimizing player experiences.
81. How to Modify the Number of Rows Fetched by SAP BusinessObjects Report
If your BO report exceeds 5000 rows, you may miss out on critical data or insights.
82. A JavaScript Infographic: Data Science Salaries in 2022
Data visualisation infographic with insights on salary level of data scientists - how to create the JavaScript dashboard and analyse its data
83. LLM-Powered OLAP: the Tencent Experience with Apache Doris
Adopting AI in our data analytic solution is a bumpy journey, but phew, it now works well for us.
84. Be a Shortstop Beagle: Learn How to Update R and RStudio to the Latest Version
Learn how to update your RStudio open source software and why you should keep it up to date.
85. How to Analyze Anything - Master Data Analysis With ChatGPT (Beginner's Tutorial)
Today, we’re diving into an exciting feature within ChatGPT that has the potential to enhance your productivity by 10, 20, 30, or even 40%.
86. Analyzing Data From U.S. Road Accidents With Data Visualization
In this article, we will analyze data on US road accidents, which can be used to study accident-prone locations and influential factors.
87. Common MS Excel Questions to Help you Excel in a Data Analyst Job Interview
Excel interview questions for data analysts
88. A Look into the History and Future of Web Analytics
Today, web analytics are an important part of how millions of businesses operate. Businesses of all sizes and stripes rely on services like Google Analytics to help them understand consumer wants and optimize web experiences for them. Data analytics is a rapidly growing field as well, expected to be worth $550 billion by 2028.
89. The Importance of Sports Analytics
You’re probably familiar with the movie Moneyball (if not, watch it!). It’s the story of Billy Beane, the former MLB player and manager of the Oakland A’s, a struggling team with one of the smallest budgets in the league. Using statistical analysis methods, he ditched all traditional advice and based recruitment purely on data. The result? The A’s won 20 consecutive games, the first team in over a century to do so.
90. Top Tableau Consulting Companies on the Market in 2020
Business intelligence has become an indispensable part of successful businesses, and the sooner executives recognize data as a crucial component of decision-making, the sooner they will be able to improve their operational processes.
91. Building a Point Map in JavaScript
Master creating interactive point maps in JavaScript! Step-by-step guide using millionaire counts for global cities for illustration. Dive in now!
92. Everyone in AI Loves Synthetic Data—But No One Can Agree on What It Is
Understand the 4 types of synthetic data—Imputation, User Creation, Insights Modeling, and Manufactured Outcomes—to enhance AI, analytics, and market research
93. Beyond Artificial Intelligence: Providing Insights to Your Customers
94. 9 Best Data Integration Software in 2022
Every business needs to collect, manage, integrate, and analyze data collected from various sources. Data integration software can help!
95. What is the Future of the Data Engineer? - 6 Industry Drivers
Is the data engineer still the "worst seat at the table?" Maxime Beauchemin, creator of Apache Airflow and Apache Superset, weighs in.
96. Data Playgrounds are The Cure for Slow and Inefficient DataOps
Companies struggle with their DataOps due to a flawed, code-centric, and linear workflow. To succeed, they must build data playgrounds, not mere pipelines.
97. Evolution of The Data Production Paradigm in AI
The long-term success of an AI-based product relies on having the infrastructure for scalable, flexible, and cost-effective data labeling for its learning.
98. Promoting Indigenous Data Sovereignty through Blockchain in Canada
How can Indigenous Sovereignty be upheld with modern technology?
99. Integrate Apache Doris Into Your Data Architecture: Real-time Data Warehousing
A whole-journey guide for financial users looking for fast data processing performance, data security, and high service availability with Apache Doris.
100. Not data-driven: purpose-driven and data-assisted

101. Unveiling Causal Impact: From Theory to Practice
We will guide you through a specific dataset, demonstrating how to implement the library and interpret results.
102. What If Your LLM Is a Graph? Researchers Reimagine the AI Stack
The global knowledge graph market is projected to reach $6.93 billion by 2030.
103. What is Data Analytics and How It Can Be Used
What is data analytics?
104. How to Achieve Optimal Business Results with Public Web Data
Public web data unlocks many opportunities for businesses that can harness it. Here’s how to prepare for working with this type of data.
105. 4 Tips To Become A Successful Entry-Level Data Analyst
Companies across every industry rely on big data to make strategic decisions about their business, which is why data analyst roles are constantly in demand.
106. Digital Marketing: Empowering Decision-Making with AI-Driven Data Analytics
AI-powered data analytics revolutionizes digital marketing by enabling real-time insights, predictive analysis, and personalized strategies
107. The Operational Analytics Loop: From Raw Data to Models to Apps, and Back Again
Over the next decade or so, we’ll see an incredible transformation in how companies collect, process, transform and use data. Though it’s tired to trot out Marc Andreessen’s “software will eat the world” quote, I have always believed in the corollary: “Software practices will eat the business.” This is starting with data practices.
108. How to Get Started with Data Governance Best Practices
Long recognized as a must in the data-driven world, data governance has never been easy for large and small organizations alike.
109. How to Implement Propensity Score Matching: A Step-by-Step Guide
Find out when to implement Propensity Score Matching and how to use it with a detailed framework and its steps.
110. Intro to AI Analytics and Top 5 Use Cases for Businesses
Analytics works by extracting meaningful patterns in data and interpreting and communicating them.
111. Trends That Will Impact Data Analytics, AI, and Cloud in 2023
As we enter 2023, the world of analytics, AI, and cloud is entering an exciting new phase, with a wide range of innovations and developments set to reshape the
112. Merging Datasets from Different Timescales
One of the trickiest situations in machine learning is when you have to deal with datasets coming from different time scales.
113. 2020: Our Meatless, Cashless, City-less Future
Happy New Year! 2019 has come and gone like Kylo Ren’s reign in The Rise of Skywalker, and so it's time for my annual prediction piece.
114. How to Build Machine Learning Algorithms that Actually Work
Applying machine learning models at scale in production can be hard. Here are the four biggest challenges data teams face and how to solve them.
115. 693 Stories To Learn About Data
Learn everything you need to know about Data via these 693 free HackerNoon stories.
116. Talk Data to Me: The Art of Analytics Translation
As organizations increasingly rely on data, the demand for analytics translators is skyrocketing, making it a must-have role for future success. Think of an ana
117. Executing a T-test in Python
In today’s data-driven world, data is generated and consumed on a daily basis. All this data holds countless hidden ideas and information that can be exhausting
118. My Weird Career Transition From MBA to Data Science
Yes, you read that correctly! I am calling my transition from being an MBA to being the Analytics Manager at a well-known consumer retail brand a "WEIRD" one. Why do I say that? Because during my 5-year journey in data science, I have had the opportunity to work with many business stakeholders, such as marketing heads, brand managers, and sales heads, and many times they have asked me about my educational background. I would like to think that they asked because of my ability to present solutions with the business context and execution feasibility in mind. The reason for asking might differ for every individual, but when I tell them that I am an MBA, their reply has always been the same: "What made you choose a technical career path after pursuing an MBA?" Hence I decided to write this post to share my thoughts on 2 things:
119. 5 Strategic Digital Transformation Domains for Your Small Enterprise
The digital era has largely changed how we do business.
120. Transforming Data into Insight: A Beginner’s Guide Using Microsoft Excel
In this guide, I'll take you through a simple, three-step process - Prepare, Analyze, Consider.
121. Top 3 Benefits of Insurance Data Analytics
The importance of data analytics and data-driven decisions across the board and, in this case, for insurance data.
122. Deep Dive Into Open Source BI Tool Helical Insight
When Helical Insight first announced a couple of years ago that they were releasing an Open Source Business Intelligence (BI) tool, it really caught my interest and I reached out to founder Nikhilesh Tiwari to find out more about what he was doing. I spent a little time with the product and really liked where it was going and was determined to do more of a deep dive in the future, and with their release of version 3.0, that time is now.
123. Database Vs Data Warehouse Vs Data Lake: A Simple Explanation
A data lake is totally different from a data warehouse in terms of structure and function. Here is a truly quick explanation of "Data Lake vs Data Warehouse".
124. New WEF Report Reveals Jobs on the Brink of Extinction
Explore the future of work (2025-2030)! Learn which jobs are rising, which are fading, and how to stay ahead in the evolving job market.
125. An Introduction to Data Connectors: Your First Step to Data Analytics
This post explains what a data connector is and provides a framework for building connectors that replicate data from different sources into your data warehouse
126. 3 Ways You Can Build and Update Websites Using Data Pushes
Data is getting more and more accessible and is increasingly being used to inform the way businesses operate.
127. How Self-Service Analytics Creates a Major Shift In Product Mindset
Interview with Amir Movafaghi, CEO at Mixpanel and ex-Twitter VP, where we discuss why self-service analytics is here to stay as a driver of product-led growth.
128. Data Analytics Career Growth
A strong technical skill set is key but it is not enough in isolation. Combining technical expertise with the five themes discussed can be a superpower.
129. Breaking Down Data Silos: How Apache Doris Streamlines Customer Data Integration
Learn how Apache Doris breaks down data silos for insurance firms, streamlining customer data integration and boosting efficiency.
130. Future of Marketing: How Data Science Predicts Consumer Behavior
Gradually, as the post-pandemic phase arrived, one thing that helped marketers predict their consumer behavior was Data Science.
131. AI and Machine Learning for Manufacturing Industry: Use Cases
Artificial Intelligence (AI) has already proven able to solve complex problems across a wide array of industries, such as automobile, education, healthcare, e-commerce, and agriculture, yielding greater productivity, smart solutions, improved security and care, and business intelligence with the aid of predictive, prescriptive, and descriptive analytics. So what can AI do for the manufacturing industry?
132. Tales of the Undead Salmon: Exploring Bonferroni Correction in Multiple Hypothesis Testing
Bonferroni correction as a solution for multiple comparisons problem in A/B tests. Here is an explanation of how it works with a simulation written in Python.
133. The Rise of Reusable SQL-based Data Modeling Tools and DataOps services
The resurgence of SQL-based RDBMS
134. 5 Upcoming Online Machine Learning Conferences in 2020
Machine learning conferences have always played an important role in the world of data science. They're a place to announce new research, discuss current issues, and connect with the community. They also help to promote new areas of research and development through Q&A sessions, workshops, and tutorials.
135. Use Up-Sampling and Weights to Address Imbalance Data Problem
Have you worked on a machine learning classification problem in the real world? If so, you probably have some experience with the imbalanced data problem. Imbalanced data means the classes we want to predict are disproportionate. Classes that make up a large proportion of the data are called majority classes. Those that make up a smaller portion are minority classes. For example, suppose we want to use machine learning models to detect credit card fraud, and fraudulent activity accounts for approximately 0.1% of millions of transactions. The overwhelming majority of regular transactions will make it hard for the machine learning algorithm to identify patterns in the fraudulent activity.
136. Why Professions Are Adding Analytics to Their Skillsets
There are many different forms of data analytics, and these have different applications in business.
137. How We Use dbt (Client) In Our Data Team
This is not really an article, but rather some notes on how we use dbt in our team.
138. Data Lakehouses: The New Data Storage Model
Data lakehouses are quickly replacing old storage options like data lakes and warehouses. Read on for the history and benefits of data lakehouses.
139. Why AI is the Future of Restaurant Sales
Think about all of the things you could do with unlimited data and insights about your sales. Now, think about all of the things you could do with future data and insights about your sales.
140. Compete on Data Analytics using Spring Cloud Data Flow
Data Driven
141. Meet Neo Auth, Peris.ai, and Midas Analytics: HackerNoon Startups of the Week
Meet Neo Auth, Peris.ai, and Midas Analytics, HackerNoon startups of the week.
142. Analyzing Dogecoin Tweet Sentiment in Real Time
How to analyze Dogecoin tweet sentiment in real-time with a new managed Kafka platform.
143. Creativity in Data Analytics is About More than Data Visualization
I recently attended a networking event where I spoke to a range of graduates who were looking at prospective careers in the data science and adjacent spaces.
144. The Art of Data Storytelling: How to Make Your Data Impactful
Data is everywhere: whether you choose a new location for your business or decide on the color to use in an ad, data is an invisible advisor that helps make impactful decisions. With quite a number of resources to choose from, data is becoming more accessible, day by day. But as soon as it has been collected, one inevitable question arises: how do I turn this data into insights that can be acted upon?
145. 5 Reasons to Invest in Analytics For Your Startup Now
Data analytics are a startup's best friend, and here are five reasons why.
146. Behind the Scenes of Using Web Scraping and AI in Investigative Journalism
Learn how journalists utilize web scraping software for investigative research.
147. How to Improve Data Quality in 2022
Poor quality data could bring everything you built down. Ensuring data quality is a challenging but necessary task. 100% may be too ambitious, but here's what y
148. Drag, Drop, and Dominate: The Best Pivot Table Libraries for Web Apps
Explore the top JavaScript pivot table and OLAP tools and their notable features for your applications in this review of leading options.
149. Big Data Analysis for the Clueless and the Curious
Big data analytics has been a hot topic for quite some time now. But what exactly is it? Find out here.
150. A Quick Guide To Business Data Analytics
For many businesses, the lack of data isn’t an issue. Actually, it’s the contrary: there’s usually too much data accessible to make an obvious decision. With that much data to sort, you need additional information from your data.
151. Beyond Correlation: How Econometric Statistics Power Real-World Decisions
Econometric statistics gives the "more." It adds rigor to ambiguity, structure to chaos, and evidence to intuition.
152. How to Consolidate Real-Time Analytics From Multiple Databases
Have you ever waited overnight for that report from yesterday’s sales? Or maybe you longed for the updated demand forecast that predicts inventory requirements from real-time point-of-sale and order management data. We are always waiting for our analytics. And worse yet, it usually takes weeks to request changes to our reports. To add insult to injury, you keep getting taxed for the increasing costs of the specialized analytics database.
153. Restructure or Recycle: Making the Right Data-driven Decisions
Understanding the difference between restructuring and recycling data allows analysts to make better-educated decisions.
154. Metrics, logs, and lineage: 3 Key Elements of Data Observability
Data observability is built on three core blocks: metrics, logs, and lineage. What are they, and what do they mean for your data quality program?
155. Context Rot Is Breaking Long AI Sessions
Bigger context windows help, but not enough. Learn how Recursive Language Models improve long-context reasoning with better scaling and stable performance.
156. Step-by-Step Guide to SQL Operations in Dremio and Apache Iceberg
Learn to set up a robust data lakehouse environment with Apache Iceberg, Dremio, and Nessie for scalable SQL operations.
157. Data-driven Marketing: Unleashing the Power of Big Data for Targeted Campaigns
In today's digital era, the abundance of data has transformed the way businesses approach marketing.
158. Data Science From Scratch
Data Science, which is also known as the sexiest job of the century, has become a dream job for many of us. But for some, it looks like a challenging maze and they don’t know where to start. If you are one of them, then continue reading.
159. Applications of Predictive Analytics in your Recruitment Journey
Elanor is an HR executive at Unicorn marketer. She’s been involved in the recruitment process for six years now. Every year they do a campus drive at the most prestigious college in Chicago. They’re always on the lookout for a promising candidate for a challenging role as a Digital Marketer. Elanor has been maintaining a spreadsheet of rejected candidates for the same post and logging the reasons for rejection as well.
160. How to Build a Data Dashboard Using Airbyte and Streamlit
In this tutorial, we built a real-time data dashboard using Airbyte and Streamlit, in the Python programming language.
161. Scraping Data With Selenium: Upwork Series #2
Hi Devs!
162. Data and Analytics Predictions for 2020 [A Top 5 List]
It would be no exaggeration to say that the capacity of technology to advance itself is proceeding at a faster rate than our ability to process these changes all at the same time. This is both amazing and alarming in the same breath.
163. Data-Driven Approach for Software Engineering: How to Avoid Common Problems
In today’s digital world, data is constantly being generated, evaluated, and updated. It also plays an important role in the work of software engineers by providing accurate, actionable feedback that helps engineers understand where and how to make improvements to a product or process.
164. How Color Psychology Impacts Branding
While there is still much research to be done, color psychology has been used in fields such as marketing and design to help create appealing products
165. Building a Data Management Strategy: Importance, Principles, Roadmap
Already routinely called the currency, the lifeblood, and the new oil of the modern business world, data promises organizations unbeatable competitive advantages.
166. How to Setup Your Organisation's Data Team for Success
Best practices for building a data team at a hypergrowth startup, from hiring your first data engineer to IPO.
167. An Intro to User Analytics in the Gaming Industry
Gaming analytics can best be defined as the whole process of applying user behavior data to guide sales & marketing, product enhancements, etc.
168. The Future of Sports Analytics: Ricky Zhang's Groundbreaking 'Shot Quality' Project
Ricky Zhang’s 'Shot Quality' project uses machine learning to redefine basketball analytics, offering deeper insights into player performance.
169. 4 Ways Cities Are Utilizing Data for Public Safety
Cities have been using data for public safety for years. What new technology is emerging in public safety, and how does it affect you?
170. Machine-Learning Neural Spatiotemporal Signal Processing with PyTorch Geometric Temporal
PyTorch Geometric Temporal is a deep learning library for neural spatiotemporal signal processing.
171. Automate Submissions for the Numerai Tournament Using Azure Functions and Python
Python Automation with Azure Functions, to compete in the weekly Numerai tournament.
172. Why Data Governance is Vital for Data Management
Both data governance and data management workflows are critical to ensuring the security and control of an organization’s most valuable asset: data.
173. Enhancing Data Preparation With AI for Business Intelligence
Learn how data-centric AI can help automate data preparation for business intelligence, ensuring reliable conclusions for subsequent data analysis with Cleanlab
174. Zipping up Lambda Architecture for Faster Performance
Lambda segregates real-time and offline big data processing. Our pipeline implements separate pipelines for each data type, allowing for efficient processing.
175. The Critical Role of Customer Data in Creating Personalized Product Experiences
With 97.2% of businesses investing in data and AI, one thing is clear: data isn’t a “nice to have” anymore; it’s a necessity.
176. Growing Data Infrastructure Complexities: Cost Implications and the Way Forward
A deep dive into the journey of data infra: from traditional databases to the Modern Data Stack as it exists today, challenges in scaling, and upcoming trends
177. Accelerating Excavation and Refinement of Data Gold Mines
Unlock the potential of data-driven decision-making with generative AI and NLP.
178. Why "Big Data" is No Longer Relevant in the Age of Machine Learning and Deep Learning
Discover why "Big Data" is no longer relevant with the rise of Machine Learning and Deep Learning. Learn how these technologies transform data analytics.
179. Python Prevails: 57% Choose Python As Their Go-to Data Science Tool
Data science uses advanced tools to extract meaning and answers from data through various programming, statistical, and communicative mechanisms.
180. Planning for Your Startup: The Data Team's Guide to 2021
Planning in a startup can feel like an exercise in futility — especially when it comes to data — especially when your data team is small and scrappy.
181. Hands-on with Apache Iceberg & Dremio on Your Laptop within 10 Minutes
From creating and querying Iceberg tables to managing branches and snapshots with Nessie’s Git-like controls, you’ve seen how this stack can simplify complex da
182. AI In Social Media: 8 Techniques To Stay Ahead Of Competition And Grow A Digital Presence
Social media is a powerful tool that enables businesses to connect with their customers and meet them at the touch points where they are.
183. Change Data Capture (CDC) When There is no CDC
How to handle changing data when the source system doesn't help.
184. How Advanced Data Analytics is Impacting B2B Sales
Machine Learning and data analytics have shown a pronounced effect on various aspects of the commercial world and industries. Enterprises are using innovation in the field of data analytics and machine learning to design better marketing campaigns. It also helps generate pricing and customer-centric recommendations and even plan more effective financial budgets.
185. Tailor Your Data Visualization Design Choices for Key Stakeholders to Create Organizational Buy-In
A guide to effective deployment of data visualizations in organisations for maximum business value. Adapted from Data Principles To Practice Volume II
186. So You Just Became a Data Science Manager... Now What?
With the rise of data science there has been the rise of data science managers. So what do you need to keep in mind if you wish to join these data translators that are acting as a conduit between the business and technical data teams? Going from a practitioner to a manager — your job now is to make sure that data resources are being used optimally so how do you go about doing this effectively?
187. Data Science With R Programming — Coding Interview Questions
R is a tool used for data management, storage, and analysis in the field of data science. It has applications in statistical analysis and modeling.
188. 7 Data Analysis Steps You Should Know
To analyze data adequately requires practical knowledge of the different forms of data analysis.
189. Approaching Cricket Analytics With Python and Indian Premier League Data
This project originated as an Independent Research Topic Investigation for the Data Science for Public Affairs class at Indiana University.
190. How to Set Up Your Own Google Analytics Alternative Using Umami
Learn how to set up your own privacy-focused, open-source web analytics platform using Umami as a simple and powerful alternative to Google Analytics.
191. How Datadog Revealed Hidden AWS Performance Problems
Migrating from Convox to Nomad and some AWS performance issues we encountered along the way thanks to Datadog
192. I Gave 5 Teams the Same Dashboard - Only 1 Made a Decision With It
Build for the decision, not the data. If you can't name the specific decision a dashboard is supposed to support, you're building a museum exhibit
193. Why Self-Service Analytics Tools Are Important For Business Decisions Making
How to use Big Data, Self-Service Analytics Tools and Artificial Intelligence to Empower your Company Business Decisions Makers with State Of The Art Software
194. Why Python Is Leading the Charge in Data Analytics
Python is one of the oldest mainstream programming languages, which is now gaining even more ground with a growing demand for big data analytics. Enterprises continue to recognize the importance of big data, and $189.1 billion generated by big data and business analytics in 2019 proves it right.
195. How to Grow your Video Business with Data
TV watching used to be a family affair a decade ago, but today in most households, content watching has become a personal activity.
196. 6 Reasons to Use Amazon Redshift
A quick guide to Amazon Redshift's benefits and use cases. Learn why your team might want to make the SHIFT to Amazon Redshift.
197. Which Database Is Right For You? Graph Database vs. Relational Database
Learn about the main differences between graph and relational databases. What kind of use-cases are best suited for each type, their strengths, and weaknesses.
198. High-Utility DeFi Data Analytics Tools For Crypto Investors
These four growing platforms will give investors the tools they need to make smarter decisions
199. Measuring True Campaign Uplift in Noisy E-Commerce Data: A Practical Heuristic Approach
A practical heuristic approach to measuring true campaign uplift in noisy e-commerce data without relying on A/B tests.
200. The Independent Phone: More Privacy, Less Freedom?
Freedom and privacy tend to go together, but there is a difference. With a more private phone, does it really mean you have more freedom?
201. How Retailers Can Leverage Personalization to Drive Customer Centricity in the Metaverse Era
The next frontier for personalization at scale is in VR and AR, and the next frontier of retail is consumer-first
202. Rust DataFrame Alternatives to Polars: Meet Elusion v4.0.0
Elusion is a new contender that takes a fundamentally different approach to data engineering and analysis.
203. Are You Poisoning Your Data? Why You Should Be Aware of Data Poisoning
As machine learning gains more prominence, these attacks may become more common. Here’s a closer look at data poisoning and what companies can do to prevent it.
204. 6 Places to Start a Career in Data Science
Want to become a Data Scientist? Here are the resources.
205. On-Chain Data Product Insights: The Data Analysis Revolution in the Web3 Era
In the rapidly evolving blockchain space today, on-chain data has become a core asset with an increasingly vital role in the ecosystem.
206. Spotify Audio Features Time Series in Additive Spotify Analyzer
There are many articles on analyzing Spotify data and many applications as well. Some are a one-time analysis of an individual's music library and some are apps built for a specific purpose. This app is different in that it does not do just one thing. It is meant to grow and provide a place to add more analysis. This article is about how the audio features time series was created.
207. Hope Is Not a Strategy in Fintech
The shift from mid-level to senior engineering thinking happens when you stop asking “will this work?”
208. Tableau Vs. Power BI: The Complete Comparison
The world of analytics is continually evolving, introducing new goods and adjustments to the modern market. New companies are entering the market and well-know
209. Less Components, Higher Performance: Apache Doris instead of ClickHouse, MySQL, Presto, and HBase
An insurance company tries to build a data warehouse that can undertake all their customer-facing, analyst-facing, and management-facing data analysis workloads
210. The Ultimate Directory of Apache Iceberg Resources
This article is a comprehensive directory of Apache Iceberg resources, including educational materials, tutorials, and hands-on exercises.
211. HarperDB is More Than Just a Database: Here's Why
HarperDB is more than just a database, and for certain users or projects, HarperDB is not serving as a database at all. How can this be possible?
212. Redpanda Lands a $100M Series C Funding Round - Interview With Alex Gallego
In an era of dried-up funding and Data Lakehouse vendor supremacy, Redpanda is going against the grain. The company just secured a $100 million Series C funding
213. Intro to Data Vault Modeling: Agility, Scalability, and Practical Applications Explained
The practical use of Data Vault models, as illustrated through querying customer orders and analyzing product sales, demonstrates the methodology's flexibility,
214. 5 Trends Shaping the Future of Data Analytics and Insights
Discover the 5 key trends shaping the future of data analytics—from synthetic data to NLP, data interoperability, data storytelling, and new data-centric roles.
215. Top 6 Mobile Analytics Tools of 2020
Data has become an increasingly important factor when it comes to the health of any app or website. Having all of your important numbers such as the number of downloads, amount of money generated from downloads and even the most recent feedback is the key to continued success.
216. 253 Stories To Learn About Data Analysis
Learn everything you need to know about Data Analysis via these 253 free HackerNoon stories.
217. KYVE Mainnet Goes Live on Pi Day, Opening The Doors To Truly Trustless Data In Web3
The mainnet of KYVE, the decentralized data lake, is officially live, opening the doors to truly trustless data in Web3.
218. COVID-19: We Need More Than Data, We Need Insights!
TL;DR We are managing the pandemic with only part of the data, and that part is not necessarily representative of reality. We must take a census of the number of positive and negative cases within a population. The officially reported positive cases contain a bias: they are cases that already manifest the disease in a more or less serious way. In the long term, the strategy of aggressive testing (the South Korea model) is the only viable and sustainable way to manage coexistence between the virus and human beings until a vaccine is available.
219. Software Development for the Nuclear Industry
One of the biggest problems facing leaders in the nuclear energy industry is the aging infrastructure in the United States and abroad.
220. Making Data-Driven Decisions in MVP Development
Explore how data-driven decisions can help you in your MVP development. Optimize your user experience and drive business growth by leveraging data analytics.
221. 229 Stories To Learn About Data Analytics
Learn everything you need to know about Data Analytics via these 229 free HackerNoon stories.
222. 5 Ways to Become a Leader That Data Engineers Will Love
How do you become a better data leader, one that data engineers love?
223. Maximizing Value of FP&A With Enterprise Planning Management
Unlock growth and optimize financial performance with Enterprise Planning Management. Harness insights through FP&A for strategic value creation.
224. The Most Commonly Used SQL Queries by Data Scientists
SQL (Structured Query Language) is a programming tool or language that is widely used by data scientists and other professionals
225. 5 DBT Repositories You Need to Star on GitHub
The 5 hottest dbt repositories you should star on GitHub in 2022. These are mine!
226. A Well-intentioned Cashback Program Caused an Increase in Fraud—Here's What Happened
Discover how our cashback strategy unexpectedly led to an increase in fraudulent activities. Learn from our A/B test results and insights on preventing fraud.
227. 143 Stories To Learn About Data Visualization
Learn everything you need to know about Data Visualization via these 143 free HackerNoon stories.
228. 5 Big Data Problems and How to Solve Them
“Big Data has arrived, but big insights have not.” ―Tim Harford, an English columnist and economist
229. From Novice to Data Pro in 90 Days: Avery Smith's Exclusive Method
Get Hired in Data Analytics Within 90 Days
230. Reimagining What Visual Data Transformation Tools Should Look Like
Data is not code. Professional analytics is not Python\SQL coding. We value our time, and time to value. Data people deserve the best tooling possible.
231. BigQuery and Attribution Models Can Reveal What Really Drives E-Commerce Success
Discover 7 powerful attribution models to analyze and optimize user journeys using BigQuery
232. Women in Tech: Azize Sultan Shares Her Inspiring Journey from Architecture to Tech Leadership
Azize Sultan shares her inspiring journey to tech leadership, tackling gender gaps, challenges, and offering advice for aspiring women in the tech industry.
233. 23andMe and Other Sites are Selling Users' Genetic Data: How Safe is Your DNA?
How genetic information from sites like 23andMe and Ancestry.com is being shared and sold.
234. The Persona Process Is Broken—Here’s the Faster One
Most companies don’t plateau at $2–10M ARR because of product or marketing. They plateau because they’re building for “everyone who might buy,” instead of a ...
235. Data Analytics is a Journey
It is 2020, and data analytics has gained so much attention, even outside of the tech community. "Data is gold", they say, and no one wants to be left behind. However, getting the right strategy is neither a straightforward nor a static process.
236. Self-Service Business Intelligence and How to Do It Properly
Self-service business intelligence, or BI, has been on the to-do list of many organizations for quite a while.
237. Predictive Data Mining Can Help Forecast the Online Behavior of Consumers (Podcast)
In this episode, we discuss how the company first began, how it has grown, and the solutions it currently offers.
238. What Is A Data Mesh — And Is It Right For Me?
Ask anyone in the data industry what’s hot and chances are “data mesh” will rise to the top of the list. But what is a data mesh and is it right for you?
239. Privacy Enhancing Technologies: Top 3 Use Cases
Security and risk management leaders can apply privacy-enhancing tech in AI modelling, cross-border data transfers, and data analytics to manage constraints.
240. How to handle your startup data like a big tech
Core principles in data management that all big tech companies adhere to can and should be adopted by startups.
241. Leveling Up Your Data-Driven Product Development With Posthog
Posthog is an open-source product analytics platform that offers flexibility and control.
242. Moving Beyond Dashboards: Rethinking Analytics in the Era of Ad Hoc Requests
Let's talk about the Pareto law, the dashboard fallacy, and how to answer the hardest question in analytics
243. Data Lineage is Like Untangling a Ball of Yarn
Data lineage is a technology that retraces the relationships between data assets. 'Data lineage is like a family tree but for data'
244. 188 Stories To Learn About Analytics
Learn everything you need to know about Analytics via these 188 free HackerNoon stories.
245. 5 Best Data Curation Tools for Computer Vision in 2021
In this article, we’ll dive into the importance of data curation for computer vision, as well as review the top data curation tools on the market.
246. Three A/B Testing Mistakes I Keep Seeing (And How to Avoid Them)
The three most common mistakes in A/B testing analysis involve the Mann–Whitney test, bootstrapping, and default Type I and Type II error rates.
247. Discover Funnel Bottlenecks: Step-by-Step Analysis with BigQuery
Learn how to use BigQuery for e-commerce funnel analysis. Track user transitions between steps like “add to cart” and “purchase,” and identify where to improve
248. What Should You Do to Trust Event Data? Part 1 – Events Catalogue
Struggling with messy event data in event-based analytics? See practical insights on organizing event definition data to make them work.
249. Event Time Processing with Flink and Beam - Power of Real time Analytics
If we can answer the what, where, when, and how of data processing, we can build a very robust stream processing pipeline using Apache Flink and Apache Beam.
250. 96 Stories To Learn About Data Engineering
Learn everything you need to know about Data Engineering via these 96 free HackerNoon stories.
251. Getting to Know Google Analytics 4: Four Smart Features You Don’t Know About
Let’s take a deeper look into Google Analytics 4 and explore some of its key features that you might not yet know about.
252. Solving Noom's Data Analyst Interview Questions
Noom helps you lose weight. We help you get a job at Noom. In today’s article, we’ll show you one of Noom’s hard SQL interview questions.
253. The Need for Data Analytics to Flood-Proof Property Investment
Did you know that the total risk of floods isn't accounted for in urban planning in the US due to a denial of climate change?
254. Conversational Analytics: the Next Generation of Data Analysis and Business Intelligence
The article discusses how data analytics in the workplace is evolving from traditional querying, Excel, and dashboards to natural language conversations
255. How Smart Analytics Can Help Small Businesses Boost Sales
Technology has taken over the world; now is the time for small businesses to realize that what they need is tech. Smart analytics makes everything easier.
256. Data Management in 2024: Will Open Data Formats Shape a “Sixth Platform”?
Can open data formats lead to a best-of-breed data management platform? It will take interoperability across clouds & formats, as well as semantics & governance
257. Navigating the Urban Grid: Spatial Data Management for Smart Cities
Explore the transformative role of spatial data management in smart cities, addressing its applications, challenges, and benefits.
258. Listen to That Poor BI Engineer: We Need Fast Joins
Yes, you can expect fast joins from a relational database.
259. Behavioral Analytics: The Foundation of Targeted Marketing and Predictive Analytics
Learn how to capitalize on your business standards and increase the conversion rate by approximately 85% by analyzing customer behaviors with data you collect.
260. Is Your Reporting Software WCAG Compliant? Make Data Accessible to Everyone with Practical Steps
One billion people experience some form of disability. Like any other software, it should be equally accessible to user
261. How Data-Driven Coaching Helps Employees Reach Their Potential
Data is everywhere. In the business world alone, we use it to track search engine traffic, monitor website activity, land sales, and improve customer service.
262. Data-Driven Talent Management: The Long Road Ahead
While we’ve been long reliant on computers and the internet to work and collaborate, an entirely officeless organization is a recent notion.
263. Thrilled to be Recognized as Contributor of the Year - Data Science & Data Analytics
Hooray! We have made it to the HackerNoon Awards. Xtract.io, the data provider company, is happy and elated to be part of #noonies2021. Join us in our victory!
264. The Role of AI and ML in Enhancing The Ability Of Multiplying Wealth
Landing a good job is generally considered the purpose of education today.
265. The Most Dangerous "AI" in Business Intelligence is the One That Sounds Right
Fluent AI answers can quietly violate BI rules. This story shows how unanchored AI creates silent governance failures in enterprise analytics.
266. What Are the Key Differences Between Qualitative and Quantitative Data?
This article uncovers the key differences between qualitative and quantitative data with examples.
267. Financial Anti-Fraud Solutions Available on the Apache Doris Data Warehouse
This post will get into details about how a retail bank builds their fraud risk management platform based on Apache Doris and how it performs.
268. Top 7 Use Cases of Predictive Analytics in Healthcare
“I’ve never been able to predict the future of anything”, said Bob Edwards, one of the most accomplished American journalists.
269. What Are the Most Common Mistakes Made by Aspiring Programmers?
I wasted A LOT of my time teaching myself the basics of coding, machine learning, and stats.
270. 7 Ways to Beat Zoom Fatigue and Improve Your Virtual Meetings
Zoom fatigue is plaguing productivity. Explore fun and data-driven ways to overcome the challenge and to add some personality to your next zoom call.
271. Understanding the Differences between Data Science and Data Engineering
A brief description of the difference between Data Science and Data Engineering.
272. Hack Your Way to LookML Mastery By Following These Tips
Tips to help BI developers create a seamless pipeline in Google's visualization tool Looker.
273. Get Started With Big Data Analytics For Your Business.
Everything we do generates data; therefore, we are data agents. The question is: how can we benefit from this huge amount of data generated every day?
274. More A/B Tests Won’t Fix Your Growth Problem
A veteran growth leader explains when A/B testing drives results—and when it slows your team down. Learn how to balance speed and accuracy.
275. Navigating Architectural Trade-offs at Scale to Meet AI Goals in 2026
Success in 2026 is predicated on having total clarity of the underlying data infrastructure.
276. Data Analytics: Apache Doris' Impact in Reporting, Tagging, and Data Lake Operations
Delve into Apache Doris, a data powerhouse revolutionizing analytics for fintech with high-performance and scalable operations.
277. What You Need to Consider When Hiring a Data Scientist
Dubbed “the sexiest job of the 21st century” by the Harvard Business Review, the demand for data scientists has grown dramatically. The number of job postings for this career increased by 31% from December 2017 to December 2018. And over the course of the last 6½ years, postings have surged by a staggering 256%.
278. Why Businesses Need Data Governance
Governance is the Gordian Knot to all Your Business Problems.
279. How to See Areas in Your Organization Where Data can Make a Difference
What is the first thing that you do when you start a new data science or analytics role?
280. How to Improve VC Deal Sourcing Using Public Web Data
Learn how public web data can help you improve your deal sourcing methods.
281. What Kind of Skills Are Required to Become a Data Analyst?
Discover the essential skills required to become a successful data analyst, including technical tools, analytical abilities, and key competencies for thriving.
282. Understanding Elasticsearch Reindexing: When to Reindex, Best Practices and Alternatives
Whether you're a seasoned Elasticsearch user or just beginning your journey, understanding reindexing is important for maintaining an efficient cluster.
283. The Importance Of Data in Sales in 2022

284. Why Data Governance in Healthcare Matters in 2024 With Nithin Narayan Koranchirath
An interview with Nithin Narayan Koranchirath about the power of data governance in healthcare, protecting patient privacy, and improving outcomes.
285. Lifecycle of a BI Report
Sam, a savvy Business Analyst, embarks on a mission to craft a BI report. Through his adventure, we'll unravel the captivating lifecycle of a BI report.
286. Top 5 Factors Behind Data Analytics Costs
A custom integrated data analytics solution would cost at least $150,000-200,000 to build and implement.
287. Data Potential: 10 Reasons Apache Iceberg and Dremio Should Be Part of Your Data Lakehouse Strategy
Discover the powerful synergy of Apache Iceberg and Dremio, revolutionizing data management and analytics.
288. The Metrics Review Ritual That Turns Product Work Into Revenue
If I could point to one of the turning points that made Codecademy’s revenue take off, it was the introduction of a metrics review process to the product t...
289. Watch Out for Deceitful Data
Nowadays, most assertions need to be backed with data. As such, it is not uncommon to encounter data that has been manipulated in some way to validate a story.
290. Building Asset and Risk Management on Codebase with Semgrep
Get structured API handlers, database tables, and client calls from microservices with Semgrep rules; score risk, prioritize AppSec routines, and monitor changes.
291. New Power BI Features For More Streamlined Data Analysis
Here are the new features of Power BI (unveiled at Microsoft Ignite 2021) that can be highly beneficial for business users.
292. The Gartner Hype Cycle Report and the Future of Data
Gartner identifies data labeling as one of the key factors responsible for the ongoing evolution of AI technology and rapid AI-powered product development.
293. Why Most “Data-Driven” Companies Still Make Bad Decisions
Many companies claim to be data-driven, yet decision-making is still slow and manual. Here’s why data alone doesn’t guarantee better business decisions.
294. Why Businesses Need to Take Full Advantage of IIOT and Data Analytics
Modern business is driven by digital technology, and yet many business leaders remain hesitant to adopt it.
295. The Recommendation Engine Behind Your Cart: Design, Build, Maintain
Explore pipeline design, Kafka/Kinesis decoupling, and the monitoring that prevents “green lights” from lying.
296. Create a Search Engine and Other Startup Ideas Using Data-Ferret
Data-ferret is a tiny, yet powerful util library to scan or transform deeply nested and complex object-like data with ease.
297. 89 Stories To Learn About Big Data Analytics
Learn everything you need to know about Big Data Analytics via these 89 free HackerNoon stories.
298. What Is Big Data? Understanding The Business Use of Big Data Analytics
Big data analytics can be applied to any business to boost revenue and conversions and to identify common mistakes.
299. Using Data Analytics for Unhindered Business Growth
Every business, regardless of the size and spread, requires data analytics support to thrive. These Top Data Analytics Trends will help you grow your business.
300. Preserving Customer Privacy: Integrating Differential Privacy with Versatile Data Kit (VDK)
Safeguard customer privacy using Differential Privacy integrated with Versatile Data Kit (VDK) for ethical data management.
301. Why Datasets are Crucial to Data Science: the Key to Informed Decisions
Datasets are crucial for anyone wanting to learn data science.
302. Capturing Trends in HealthCare at 1mg (E-pharmacy Unicorn)
Recommendations in healthcare, using simple analytics to show the most trending products on the platform.
303. Leveraging Loyalty: Can Businesses Digitize Customer Retention?
Customer retention is a core component for all ambitious businesses, but can loyalty be digitized to great effect?
304. Investors Clamor for Digestible Data Analytics in the Fledgling Crypto Industry
As DeFi data generation grows with the industry, there is an increased need for platforms that are able to digest and analyze this data for investors.
305. Improving Healthcare Analytics and Implementation with Talend
Talend Open Studio for Data Integration can benefit direct marketers, offering several inbuilt Business Intelligence tools.
306. How to Build Connections for A/B Testing and Linear Regression: An Essential Guide
In a world of LLMs and cutting-edge architectures, linear regression quietly plays a crucial role, and it’s time we shine a light on how it can be beneficial.
307. Qwen3.5-27B Distilled Model Cuts Reasoning Costs Without Losing Accuracy
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF delivers shorter reasoning chains and 96.91% HumanEval pass@1.
308. A Leader's Guide to Data-Driven Success
Transform data from a source of frustration into a powerful business tool with this practical guide for executives.
309. Business Intelligence Is So Important For Development In Healthcare: Here's Why
Only well-implemented EHR systems with built-in analytical software, served by professionals, can quickly extract information with decision-making potential.
310. How Much Can You Make as a Data Scientist?
Wondering how much data scientists make? We're here to help you find out about salaries in Data Science and how they are influenced by various factors.
311. Elasticsearch VS Apache Doris in Log Analysis
Discover how Apache Doris revolutionizes log analysis. From schema-free support to cost-effective storage, learn how to build an efficient log analysis system.
312. AI, Automation & the Future of Manufacturing: 3 Problems to Solve?
It's time to start thinking about the future – and the future is now when it comes to artificial intelligence (AI) in manufacturing.
313. Analyzing Data: What Are Text Mining and Text Analytics?
What are text mining and text analytics?
314. "Connect, Analyze and Learn from Data" - Dr. Yu Xu
Welcome to "Mondays with Entrepreneurs". This week we have with us an Entrepreneur and tech expert who thinks Monday is the most exciting day of the week.
315. Privacy Protection and Web3 Analytics
Although more and more developers and product designers have joined the Web3.0 world in recent years, most of them overlook that they are still using centralized infrastructure (data analytics tools) to build apps and websites. Every minute, project builders are making themselves part of the reason for data breaches, as they have to collect user data, intentionally or unintentionally, for product improvement.
316. Revolutionizing Data Analytics with AI: A Seven-Step Odyssey
Transforming Data Analytics with AI
317. Visualization of Hypothesis on Meteorological data
In this blog, we are going to analyze meteorological data and test a hypothesis based on visualization.
318. 'At the Coalface of Implementing Data Stacks': kleene's Co-founder & CEO Andrew Thomas
2-minute look at the building of kleene.ai through a founder's eyes.
319. A Brief Introduction Into A Typical Data Science Project Life Cycle
In this post, I demystified data science and talked about the lifecycle of a typical data science project. It's a good read for everyone.
320. AI Is Making Our Concrete Buildings And Bridges Safer
AI's application to civil engineering and concrete construction is the future of structural safety. There have been various successful & innovative applications.
321. Corporate Lending - The Impact of Artificial Intelligence and Data Analytics on Financial Services
In this comprehensive exploration, we delve into the revolutionary impact of artificial intelligence (AI) and data analytics on the corporate lending landscape
322. The Essential Skills Every Marketer Needs
This article will help you understand the demand for digital marketing and the skills you will need to enter a digital marketing career
323. The Direct Lake Mirage: What Really Happens at 99 Million Rows
A real 99M-row benchmark reveals why Import Mode still outperforms Direct Lake in Microsoft Fabric and what the engine truth means for your BI architecture.
324. 5 Most Important Tips Every Data Analyst Should Know
The 5 things every data analyst should know and why it is not Python, nor SQL
325. Secure Enclaves and ML using MC²
Announcing the official release of MC², a platform for secure analytics and machine learning.
326. Things to Consider When Looking For Data Science Roles
There is a great demand for data scientists presenting market dynamics that are favourable for the community. More so than your peers in other professions, you will be able to evaluate a company for what it is able to offer you, rather than solely being the one that is being evaluated. So what should you look for when comparing and evaluating data science roles? Here is a list of some commonly known factors plus some less discussed ones that will help you in your evaluation.
327. Strategies and Best Practices for Ensuring Data Consistency

328. The Importance of Monitoring Big Data Analytics Pipelines
In this article, we first explain the requirements for monitoring your big data analytics pipeline and then we go into the key aspects that you need to consider to build a system that provides holistic observability.
329. How to Define Data Analytics Capabilities
Disclaimer: Many points made in this post have been derived from discussions with various parties, but do not represent any individuals or organisations.
330. Data Journalism 101: 'Stories are Just Data with a Soul'
Gone are the days when journalists simply had to find and report news.
331. Data Modeling - Entities and Events
Both events and entities have unique roles in data modeling, and understanding when to use each is crucial for building effective data platforms.
332. Moving From the Flat Earth: Why We Should Switch to Data-Driven Finance
Businesses should switch from linear formulae to data-driven finance. This will allow companies to not only get an immediate revenue boost!
333. How Data Analytics is Changing the Restaurant Industry
The integration of POS and advanced analytics helps businesses get the best single view of individual customers' needs across different restaurant outlets.
334. How to Democratize Access to Data Insights for Businesses of All Sizes
Messy government data has been part of the reason we've been unable to understand the COVID-19 pandemic. If federal organizations can't decode big data, what hope do small businesses have?
335. An Introduction to Data Automation for Business Efficiency
In today’s competitive business landscape, data automation has become necessary for business sustainability. Despite the necessity, it also comes with a few challenges--collecting, cleaning, and putting it together--to get meaningful insights.
336. Applying Criminology Theories to Data Management: "The Broken Window Theory" and "The Perfect Storm"
What can be done to prevent “Broken Windows” in the primary data source? How can we effectively fix existing “Broken Windows”?
337. The Future is Visual: The Image Search Revolution
Images surround us everywhere. Traditionally used keyword-based search is often not sufficient, as it cannot capture the richness of visual content.
338. Harnessing the Power of Data Science in Sports
The market for data science and analytics in sports is expected to grow to $2.93 billion at a rate of 20.65%. According to a survey conducted by KPMG, 97% of sports professionals believe that technology, including data science and analytics, will have a significant impact on the sports industry in the coming years.
339. Want To Earn 100k and Above? Then Look to Data Science Jobs
Data science is more vital than ever in the AI era, offering high salaries and essential skills for tech professionals.
340. What Is Modern Business Intelligence?
This article gives insight into some basic features and functionality that desirable modern BI software has and illustrates them with some examples.
341. Stop Deleting Outliers—Here’s What You Should Do Instead
Learn 3 simple, effective methods to detect and handle outliers in your data. Improve analysis accuracy and make smarter decisions with clean datasets.
342. How To Get Real-Time Analytics By Consolidating Databases
Benchmark a Hybrid Transactional and Analytical RDBMS (Photo: Sawitre)
343. Using Data Analytics Effectively in Marketing
How to make your data work harder for you in marketing
344. Amazon Kinesis: The AWS Data Streaming Solution
A quick guide to Amazon Kinesis, covering an introduction, top advantages, and use cases.
345. DuckDB in Action: A Review
This review is about DuckDB in Action by Mark Needham, Michael Hunger, and Michael Simons from Manning.
346. Leveraging Python's Pattern Matching and Comprehensions for Data Analytics
Pattern matching allows for more intuitive and readable conditional logic by enabling the matching of complex data structures with minimal code.
347. 6 Pitfalls to Avoid When Transitioning To a Data Science Career
If you are considering the transition to a data science career these are common mistakes and traps you'll want to avoid.
348. Accelerating Analytics by 200% with Impala, Alluxio, and HDFS at Tencent
This article describes how engineers in the Data Service Center (DSC) at Tencent PCG (Platform and Content Business Group) leverage Alluxio to optimize analytics performance and minimize operating costs in building Tencent Beacon Growing, a real-time data analytics platform.
349. Business Intelligence vs. Data Analytics: Deciphering the Distinction
Draw a line between Data Analytics and Business Intelligence — compare their scope, capabilities, use cases, and potential for business decision-making.
350. Sreenivasarao Amirineni: How Machine Learning Contributes to Modern Business Efficiency
Explore how Sreenivasarao Amirineni leverages machine learning to enhance efficiency in the insurance sector, driving innovation and significant cost savings.
351. Using User Data After Google's Third-party Cookies Ban
Google announced that it would ban the use of third-party cookies, which has left many publishers afraid that they won't be able to use user data.
352. Startup Interview with Zoltan Csikos, Co-Founder & CEO, Neticle
Neticle offers a range of text analytics tools for businesses. If you have textual data to analyze, Neticle has a solution for you!
353. Employee Training: How to Make Data-Driven Business Decisions
According to PwC research, highly data-driven organizations are three times more likely to witness considerable improvement in decision-making. Unfortunately, a whopping 62% of executives still rely more on experience and gut feelings than data to make business decisions.
354. Top 3 Things You Forget When Building Your SaaS Product
While the number of product management roles in the US has grown by more than 30% in two years, according to LinkedIn, the responsibilities of the job are morphing.
355. How to Improve Social Media Campaign Using Data Visualization
Learn what social media data visualization is and why it is important.
356. Understand Data Analytics Framework Using An Example From General Electric Company
The framework will allow you to focus on the business outcomes first and the actions and decisions that enable the outcomes.
357. Digital Transformation Strategy: Dinosaurs, Harpoons, Greek Myths and YOU!
Digital transformation is not one single thing to implement; it is a core alignment with continuous investment in innovation and excellence.
358. A Brief History of Computing and Data Analytics - From Punch Cards to the "Modern Data Stack"
A journey from the origins of computing and data analytics to what we now call the "Modern Data Stack". What comes next?
359. How to Export Metrics from Databricks Serving Endpoint to Datadog
If you are using a Databricks serving endpoint and want to export metrics to Datadog, you may face some challenges with the Datadog documentation.
360. Maximizing E-commerce Potential with Refined Data Analytics and Storage Architecture
In this post, I write about how my team carries out refined operations, based on our own Data Management Platform (DMP).
361. Do You Need All This Data?
A “lean data” strategy is necessary for today’s e-commerce businesses to stay nimble, avoid “data muck” and not be bogged down by too much data.
362. Auto-Increment Columns in Databases: A Simple Trick That Makes a Big Difference
An introduction to auto-increment columns in Apache Doris, usage, applicable scenarios, and implementation details.
363. How To Segment Shopify Customer Base with Google Sheets and Google Data Studio
After defining what RFM analysis stands for and how you can apply it to your customer base, I want to show you how to apply it to Shopify order data.
364. 6 Ways to Increase Revenue in 2020 with Market Intelligence Data
Data analytics tools are increasingly being used in businesses, but many people still make critical decisions based on assumptions and guesses. The most common reason for this is the lack of a single, integrated source of information that gives executives accurate and consistent data whenever needed.
365. How To Build a Data-Savvy Brand
Since new-gen tech has enabled companies to mine large sets of structured and unstructured data, the idea of becoming a data-driven company has become the preoccupation of many executives.
366. The Design Work Nobody Posts on Dribbble
Five years into product design at a fintech, the majority of my work is documentation, edge cases, and maintenance.
367. 6 Data Analytics Growth Hacks for SMBs
Data analytics offers you amazing capabilities to grow your business. Leverage the power of these amazing data analytics hacks to reach your business goals.
368. Staying Ahead Consistently with Competitive Pricing Intelligence
Business is looking good, you are making decent profit margins each year, and your customers seem to be happy with your services.
369. How AI and Data Analytics Will Impact The Era of COVID-19
Artificial intelligence (AI) and data analytics are rapidly growing trends in the tech world. With increasing potential for innovation, it is paramount that we stay up to date with all the latest developments in this field. According to MarketsandMarkets, the worldwide artificial intelligence (AI) market will increase from USD 58.3 billion in 2021 to USD 309.6 billion by 2026, at a compound annual growth rate (CAGR) of 39.7 percent over the projected period. It seems that every company wants a piece of this growing pie. By 2022 it is expected that 90% of companies will be using some form of artificial intelligence for data analytics purposes.
370. Is Your Business Suffering from Big Data Burnout? 5 Ways to Democratize Data
With so much data available at your fingertips, if you fail to implement a strong system, your business is at risk of suffering from big data burnout.
371. Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel’s Analytics Zoo powered by oneAPI
This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. Details on the new architecture and workflow, as well as Alluxio’s performance benefits and benchmarks results will be discussed. The original article can be found on Alluxio's Engineering Blog.
372. Six Habits to Adopt for Highly Effective Data
Put your organization on the path to consistent data quality by adopting these six habits of highly effective data.
373. Data Scientist Careers at Amazon: What You'll Earn, Learn, and Work On
Find out what it means to be a data scientist at Amazon! Their salaries, roles and required experience, types of data positions, and interview process.
374. Optimize Power BI Reporting and Designing
As a data analysis tool, Power BI comes loaded with plenty of report generation and design features. However, do not rely on the default settings of the tool.
375. Year of the Graph Newsletter, April 2020: Graphs Power Scientific Research; Business Cases
Is there life after COVID-19? Of course there is, even though it may be quite different, and it may be hard to get there. But there’s one thing in common in the “before” and “after” pictures: science and technology as the cornerstones of modern society, for better or worse.
376. Stop Guessing What Customers Want With This Analysis Technique
Voice of Customer analysis is powerful and can create important and long-lasting change in your business, but it is not a one-time solution to a problem.
377. A Brief Intro to 8 Ways AI Could Improve Patient Care
How much data does a hospital produce each day? How much information are they capable of storing, analyzing, and sharing with physicians and patients?
378. How to Create a Data Analytics Strategy to Grow Your Business
Are you building a Software-as-a-Service platform? Wondering what data is essential for your business? Time for a Data Analytics Strategy.
379. Enhancing Experiment Sensitivity in B2C: A Robust Framework for Heavy-Tailed Metrics
Boost B2C experiment sensitivity with Cross-Fitted CUPED. Learn how to handle heavy-tailed metrics like ARPU without overfitting. Includes Python code.
380. The Unseen Battleground: An Architect’s Retro on Streaming 1 Billion Minutes of Live Sports
Streaming 1B minutes of live sports: hard choices, scars, and lessons in building real-time, petabyte-scale systems.
381. The Metric Hierarchy Every Subscription Company Needs
I would argue that 99% of companies that are really good at developing tech products do these three things: They have clearly defined metrics that they are t...
382. Can Your Organization's Data Ever Really Be Self-Service?
Self-serve systems are a big priority for data leaders, but what exactly does it mean? And is it more trouble than it's worth?
383. How Did You Become A Data Scientist?
Every data professional has a unique story as to how they entered the field of data science. Here is my career path origin story.
384. Launch Readiness Matters More Than Code
Launch day reveals what you should have built. Launch readiness is everything else.
385. Designing DeFi for People Who Hate DeFi
How I designed Merlin by VALK - a DeFi analytics tool built for traditional finance professionals who don't speak crypto-native language....
386. Hacking Your Analytics: Top Barriers in Harnessing the Power of Data
An infographic on how to use more of your organization's data with Google Analytics 360 to proactively make solid, data-based business decisions.
387. Helping Every User Navigate Data - Interview with Startups of the Year Nominee, dScribe
dScribe has been nominated in HackerNoon's annual Startup of the Year awards in Ghent, Belgium. Here's why.
388. How To Choose The Right Business Intelligent Tool
In this blog, we look at strategies for selecting the right BI tool as well as some important things to keep in mind throughout the process.
389. Practical Tips to Improve Customer Experience with Data
According to a report, almost 70% of companies compete on customer experience.
390. Unveiling the Secrets of Satoshiverse: A Q&A with the Founders
Satoshiverse is an upcoming web3 game made in Unreal Engine 5, and in an exclusive interview, the developers tell us how W3W's platform has helped development.
391. "Writing Routine Needs to be Fluid and Adaptable": Meet the Writer Alex Jivov, CEO of Hopeful Inc.
Meet Alex Jivov, CEO of Hopeful Inc. and a person of many interests, who gives us more details on how a former journalist approaches a new writing routine.
392. What It Takes to Design for 5 Million Crypto Users
How I redesigned Merlin by VALK for the Ledger Live integration - two audiences, two design systems, and a compressed timeline....
393. Ensuring Security in Your SaaS Applications [An Overview]
Enterprises are constantly faced with the task of balancing the advantages of productivity gains and lower costs against significant compliance and security concerns as they move their data and applications to the cloud.
394. Metrics that Matter: The Essential 5 Metrics for B2B SaaS Companies
The go-to B2B SaaS metrics to drive analytics for growth and higher performance.
395. AI in Marketing: How Can It Help You?
With AI leaving or expected to leave its stamp on every business and profession known to mankind, is it possible that it would not leave its imprint---
396. Unleash the Power of Cohort Analysis & CLTV Modeling in Analytics
Discover the key differences between Cohort Analysis and CLTV modeling. Learn how top companies use both to drive retention, revenue, and strategic decisions.
397. How Big Data in IoV Analytics Prevents Accidents
As electric vehicles surge, the Internet of Vehicles (IoV) is reshaping the automotive landscape.
398. 5 Most Common Data Quality Issues For Business
With the advent of data socialization and data democratization, many organizations organize, share and make information available to all employees in an efficient manner. While most organizations benefit from liberal use of such a source of information available to their employees, others struggle with the quality of the data they use.
399. What Working on an Analytics Product Can Teach Us About Data
The ubiquity of analytics hides potential complexity underneath, especially when you start to consider products where the analytics are more front and centre.
400. The Qnum Analytics Team On Turning A Side Gig Into A Full Time Business
The team behind Qnum Analytics, a tool leveraging AI to help businesses fix leaky inventory buckets, shares their origin story and what makes their team special.
401. A Guide to Understanding Pandas Series Labeled and Unlabeled Data Structures
Learn how to create and manipulate pandas Series, versatile data structures for efficient data handling in Python. Explore labeled and unlabeled formats,
402. The HackerNoon Newsletter: MCP vs A2A - A Complete Deep Dive (8/24/2025)
8/24/2025: Top 5 stories on the HackerNoon homepage!
Thank you for checking out the 402 most read blog posts about Data Analytics on HackerNoon.
