Data Ingestion: The First Step to Generate Valuable Business Insights

February 8, 2023 Rasvan Grigorescu

Your business may accumulate a lot of data from multiple sources. But if the sources are not connected, it’s difficult to get the full value from that data, particularly as your business grows. The analytics within each transactional system, such as Microsoft Dynamics, provide limited reporting capabilities out of the box.

Perhaps you have deployed a Microsoft Dynamics ERP solution along with other enterprise applications to assist with sales, marketing and operations. To tap into the synergy of combining these data sources, you need to extract the data, ingest it into a centralized data lake, and then enrich it so that a solution like Microsoft Power BI can generate more sophisticated reports. And those insights become even more powerful when you connect additional data sources.

Just Having the Data Is Not Enough

This blog is the second in our series on Leveraging Data to Generate Business Insights. In the first blog, we discussed why data management is important. Just having data doesn't make the data a business enabler. To help your company increase revenue, lower operational costs, and improve employee productivity, you need to deploy processes for ingesting, transforming, modeling, analyzing, visualizing, collaborating, and governing the data.

For this blog, we examine data ingestion, the initial phase in the data management lifecycle. One of the keys is to design an ingestion process that optimizes the speed of the data loads—striking a balance between loading the data and querying the data. Many businesses focus on query speed because end-users need to get insights fast. But if data loading is a cumbersome, slow process, it can affect the downstream steps.

Key Questions for Designing the Ingestion Process

As you design your data ingestion process, there are several questions to answer:

How many data sources are there, and where is the data coming from?

The source of operational and transactional data could be internal systems such as Dynamics 365 Business Central, Finance and Operations, or Customer Engagement. You may have other business applications such as an e-commerce website or a warehousing application. There could also be external sources, perhaps market data from third-party providers such as Nielsen, Gartner, or IDC, as well as sales data from distributors and business partners. You might collect customer sentiment data from social networks.

Other potential data sources are your on-premises legacy systems that store historical data. Perhaps you are sunsetting Dynamics AX or GP as you migrate to a cloud-based Dynamics 365 ERP system, or you are starting to use SaaS applications, and you don't want to lose legacy data that might span multiple years. You might also collect streaming data from your production machines on the manufacturing floor, or from environmental or building sensors on an IoT (Internet of Things) network.

How often is data refreshed?

Is your data uploaded in batches, in real time, or near real time? Perhaps you’re on an hourly, daily, weekly or monthly schedule—or there’s a mix among your various data sources.

It’s also important to consider if you need incremental updates (deltas) or full updates. You might want data snapshots to compare data from this year to last year at the same point in time, for example.
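The difference between incremental and full updates can be sketched in a few lines of Python. This is a minimal illustration, not a production loader: it assumes each source row carries an `id` key and a `modified_at` timestamp, and the function names `incremental_load` and `full_load` are hypothetical.

```python
from datetime import datetime

def incremental_load(source_rows, target, last_watermark):
    """Upsert only rows changed after last_watermark; return the new watermark."""
    new_watermark = last_watermark
    for row in source_rows:
        if row["modified_at"] > last_watermark:
            target[row["id"]] = row  # insert new rows, overwrite changed ones
            new_watermark = max(new_watermark, row["modified_at"])
    return new_watermark

def full_load(source_rows, target):
    """Discard the target contents and reload every row from the source."""
    target.clear()
    for row in source_rows:
        target[row["id"]] = row

# Hypothetical sample data standing in for a source system.
source = [
    {"id": 1, "amount": 100, "modified_at": datetime(2023, 1, 1)},
    {"id": 2, "amount": 250, "modified_at": datetime(2023, 1, 15)},
]
target = {}

# The first run picks up everything; later runs pick up only the deltas.
watermark = incremental_load(source, target, datetime.min)
source[0] = {"id": 1, "amount": 120, "modified_at": datetime(2023, 2, 1)}
watermark = incremental_load(source, target, watermark)
```

One trade-off to keep in mind: a watermark-based delta like this does not notice rows deleted at the source, whereas a full load does, at the cost of moving all the data every time.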

Where and how is data stored?

At many businesses, data from different sources is siloed in multiple disparate systems, and that creates a big challenge with respect to getting data insights. Some businesses solve this problem by storing information in a data lake, which provides the benefit of an open format where all types of data can be stored, and open APIs can access the data. A data lake also scales easily, and the cost is usually low.

Conversely, data warehouses and traditional databases are typically more established and offer structured, high-quality data along with fine-grained security and governance. Data warehouses are usually more expensive than data lakes and often use a closed, proprietary format.

An alternative is the data lakehouse, a hybrid between a data lake and a data warehouse that gives you the best of both worlds. A data lakehouse combines the open format of data lakes with the structure of SQL (Structured Query Language) warehouses. You get the rigor of the warehouse with the openness and scalability of the lake. This approach also offers a single data architecture for many workloads.

How is the data formatted?

A structured tabular format, with rows and columns, is easy for most end-users to make sense of. Another common format is CSV (comma-separated values), which almost any data source can export, making CSV files well suited to data interchange.

There’s also semi-structured data, such as text or chat transcriptions. It is not totally unstructured, because you can usually recognize words, sentences and phrases. Unstructured data usually comes in the form of pictures, videos or audio.

Then there are the advanced modern formats, which are becoming popular. Apache Parquet is an open source, column-oriented data file format that works well for data lakes and facilitates point-in-time data storage. Apache Avro is a serialization format for record data and a good choice for streaming data. JSON, the JavaScript Object Notation format, is text-based and commonly used for transmitting data in web applications.
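To make the format differences concrete, here is a small sketch that round-trips the same records through CSV and JSON using only the Python standard library. The sample records and the helper names `to_csv` and `from_csv` are made up for illustration. Parquet and Avro are left out because they require third-party libraries.

```python
import csv
import io
import json

# Hypothetical sample records.
records = [
    {"order_id": 1001, "customer": "Contoso", "total": 249.5},
    {"order_id": 1002, "customer": "Fabrikam", "total": 1310.0},
]

def to_csv(rows):
    """Serialize a list of flat dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def from_csv(text):
    """Parse CSV text back into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))

csv_round_trip = from_csv(to_csv(records))
json_round_trip = json.loads(json.dumps(records))

# CSV carries no type information, so numbers come back as strings.
print(type(csv_round_trip[0]["order_id"]).__name__)
# JSON keeps numbers as numbers, so the round trip matches the original.
print(json_round_trip == records)
```

The practical consequence for ingestion is that CSV sources usually need an explicit schema or type-casting step after loading, while JSON preserves basic types and nesting at the cost of larger files.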

How is data accessed and is cleansing required?

Most of the time, data management systems access databases and data warehouses using SQL. For data lakes, open APIs are common, but you can also use Python. Most businesses use a combination of open APIs and SQL, depending on the data source.
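As a rough illustration of SQL access from Python, the sketch below uses the standard library's `sqlite3` module with an in-memory database standing in for a real database or warehouse. The `sales` table and its rows are hypothetical, and a real warehouse would need its own driver and connection details.

```python
import sqlite3

# An in-memory SQLite database stands in for a real warehouse here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("West", 1200.0), ("East", 800.0), ("West", 300.0)],
)

# Aggregate in SQL and collect the (region, total) pairs into a dict.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()
```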

Data is rarely usable as-is, so you will likely need to cleanse your data as you ingest it, which will greatly increase the data quality. A key consideration here is whether to cleanse the data prior to or after loading—we will address this in our next blog on data transformation.

Answers Provide Better Understanding of the Required Design

In addition to answering the questions above, also consider how robust the ingestion process needs to be. For example, if an ingestion workflow fails, you may want to program it to retry automatically and then alert you when it’s completed or if it keeps failing. Security controls are also a key consideration.
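The retry-and-alert behavior described above could look something like the following sketch. The function name `run_with_retries` and its parameters are assumptions for illustration. In practice the `alert` callback might send an email or post to a chat channel instead of printing, and many orchestration tools provide retries as a built-in setting.

```python
import time

def run_with_retries(job, max_attempts=3, base_delay=1.0,
                     alert=print, sleep=time.sleep):
    """Run job(); retry on failure with exponential backoff, then alert."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = job()
        except Exception as exc:
            if attempt == max_attempts:
                # Out of retries: tell someone, then surface the error.
                alert(f"Ingestion failed after {attempt} attempts: {exc}")
                raise
            # Wait 1x, 2x, 4x ... the base delay before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
        else:
            alert(f"Ingestion completed on attempt {attempt}")
            return result

# Hypothetical flaky job: fails twice, then succeeds.
attempts = {"count": 0}

def flaky_ingest():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("source unavailable")
    return "loaded"

messages = []
outcome = run_with_retries(flaky_ingest, alert=messages.append,
                           sleep=lambda seconds: None)
```

Passing `alert` and `sleep` in as parameters keeps the sketch easy to test, because the waiting and the notification can be swapped out without touching the retry logic.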

By answering these questions, your organization will better understand how to set the design goals and the key considerations for the data ingestion process. You will also identify the tools and methods you need to utilize in this phase.

If you need help leveraging your data to generate business insights, Western Computer is here to help. Contact us today to learn more about our Microsoft solutions and how they can help you aggregate, analyze and govern all your data sources.

About the Author

Senior Data Architect Rasvan Grigorescu has more than 20 years of leadership and technical experience in driving business value from data.