Transitioning from Traditional Data Warehousing to Modern Web Analytics

From Google to BAM and Business Intelligence: Exploring the Spectrum of Web Analytics

The journey from traditional data warehousing to modern web analytics has been transformative. Initially, data integration tools were pivotal in extracting and transforming structured data from line-of-business (LOB) operational applications and loading it into data warehouses for business intelligence (BI) reporting and analysis. Today, the analytics arena has expanded to include content analytics, event analytics, and web analytics, with collaborative analytics emerging as a new frontier. This article covers the purpose and functionality of web analytics, the tools and techniques employed, and the future of data analytics in a collaborative environment.

The Evolution of Data Analytics

From Traditional Data Warehousing to Modern Analytics

In the early days, data warehousing was the cornerstone of business intelligence. Data integration tools played a crucial role in extracting and transforming structured data from traditional LOB operational applications. This data was then loaded into a data warehouse, where BI reporting and analysis tools could process it. However, the analytics landscape has since evolved, with various departments now deploying different types of analytics, including content analytics, event analytics, and web analytics.

The Rise of Collaborative Analytics

The development of collaborative and social computing tools is paving the way for collaborative analytics. This shift is significant because many individuals building these analytical solutions may lack deep knowledge of BI and data warehousing. As a result, it is unrealistic for the BI group to assume they can fully integrate this influx of information into a data warehousing environment. Web analytics serves as a prime example of this challenge, as it can be developed by various groups within an organization.

The Purpose of Web Analytics

Understanding Web Analytics

According to the Web Analytics Association (WAA), web analytics involves the collection, measurement, analysis, and reporting of internet data to optimize web usage. The WAA establishes standard metrics that web analytics products should support, such as page views, visits, unique visitors, new visitors, returning visitors, clickthroughs, and conversions. These metrics are essential for identifying website visitors, understanding their behavior, and measuring whether a visit succeeded, for example by ending in the purchase of a product or service.
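To make three of these metrics concrete, the following is a minimal sketch of how page views, visits, and unique visitors might be derived from raw hits. The Hit record, the summarize function, and the 30-minute inactivity cutoff are assumptions made for illustration; they are not taken from the WAA definitions or from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed inactivity cutoff that separates two visits by the same visitor.
SESSION_TIMEOUT = timedelta(minutes=30)


@dataclass
class Hit:
    visitor_id: str      # hypothetical cookie-based identifier
    timestamp: datetime
    url: str


def summarize(hits):
    """Return page views, visits, and unique visitors for a list of hits."""
    visits = 0
    last_seen = {}  # visitor_id -> timestamp of that visitor's previous hit
    for hit in sorted(hits, key=lambda h: h.timestamp):
        previous = last_seen.get(hit.visitor_id)
        # A visit starts on a visitor's first hit or after a long gap.
        if previous is None or hit.timestamp - previous > SESSION_TIMEOUT:
            visits += 1
        last_seen[hit.visitor_id] = hit.timestamp
    return {
        "page_views": len(hits),
        "visits": visits,
        "unique_visitors": len(last_seen),
    }


sample_hits = [
    Hit("a", datetime(2010, 3, 1, 9, 0), "/"),
    Hit("b", datetime(2010, 3, 1, 9, 5), "/"),
    Hit("a", datetime(2010, 3, 1, 9, 10), "/products"),
    Hit("a", datetime(2010, 3, 1, 11, 0), "/"),
]
print(summarize(sample_hits))
```

In this sample, visitor "a" returns after a gap longer than the cutoff, so four page views from two unique visitors are counted as three visits.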

Tactical and Strategic Decision-Making

WAA metrics provide after-the-fact summaries of past events, primarily intended for tactical and strategic decision-making. However, for operational decision-making, such as fraud detection or real-time marketing campaigns, different tools are required. Identifying visitors can be challenging and often requires the use of cookies or customer relationship management (CRM) tools to gather additional data.

Optimizing Web Performance

Overall, web analytics offers valuable insights into website usage, visitor behavior, and online performance. These insights can help businesses optimize their online presence and improve customer engagement.

How Web Analytics Products Function

Data Collection Methods

Web data can be collected using two primary methods: page tagging and log file analysis.

Page Tagging

Page tagging involves adding code, often written in JavaScript, to a web page to inform a third-party server when the page is rendered by a web browser.
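One way to picture the receiving end of a page tag is as a server that decodes the request the tag sends. The sketch below assumes the tag reports its data as query-string parameters on a URL; the collector host, the parameter names (page, visitor, ref), and the parse_beacon helper are all hypothetical and do not describe how any specific product encodes its data.

```python
from urllib.parse import parse_qs, urlsplit


def parse_beacon(request_url):
    """Turn a tag's request URL into a flat dict of reported values."""
    query = parse_qs(urlsplit(request_url).query)
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in query.items()}


# Hypothetical request a page tag might send once the page has rendered.
beacon = (
    "https://collector.example.com/hit"
    "?page=%2Fproducts&visitor=abc123&ref=https%3A%2F%2Fexample.org%2F"
)
print(parse_beacon(beacon))
```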

Log File Analysis

Log files generated by the web server that hosts a website can be analyzed to gather data. Some products also support network sniffing, which is not discussed here.
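As an illustration of log file analysis, here is a minimal sketch that parses one line in the Common Log Format written by many web servers. The choice of that format, the CLF_PATTERN regular expression, and the parse_line helper are assumptions for this example; a real server may be configured to log additional or different fields.

```python
import re
from datetime import datetime

# host, identity, user, [time], "method path protocol", status, size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Parse one Common Log Format line; return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # The format writes "-" when no response body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


line = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
print(parse_line(line))
```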

Prominent Web Analytics Tools

Google Analytics

Google Analytics is a prominent example of a product that uses page tagging. It is a free software-as-a-service (SaaS) offering from Google that generates comprehensive visitor metrics for a website and is aimed at marketers rather than webmasters. The product is useful for measuring the effectiveness of marketing campaigns that use Google's AdWords feature. Websites with fewer than 5 million page views per month can use the service even without an AdWords account. The Google Analytics JavaScript code collects visitor data and sends it back to Google data collection servers. The servers process the data periodically and generate reports that the website owner can access on demand. Google also provides the fee-based Urchin Software for in-house use.

Other Tools

Other SaaS and in-house products that compete with Google Analytics include Coremetrics, Omniture SiteCatalyst (Omniture was recently acquired by Adobe), Unica NetInsight, WebTrends Analytics, and Yahoo Web Analytics. CMS Watch offers an excellent report for purchase comparing these and other web analytics products, and its website has a free report appendix documenting how these products support WAA metrics.

Considerations for Purchasing Web Analytics Products

When purchasing a web analytics product, it is crucial to consider its ability to handle web pages with dynamic content that includes Rich Internet Applications (RIA) created with technologies such as Ajax and Adobe Flash. The capability to track RSS syndication readership and mobile users may also be essential for some organizations. Some vendors, such as SeeWhy, provide specific applications for web marketing. All of the products mentioned above support page tagging, while a few also support log file processing.

Comparing Tagging and Log File Approaches

The comparison table below highlights the differences between the tagging and log file approaches.

Leveraging Web Data in a Data Warehousing Environment

Integrating Web Data with Enterprise Data

Log files can serve as an ideal data source for a data warehousing environment, allowing web data to be correlated with other types of enterprise data. Given the volume of data involved, some filtering and consolidation may be necessary before loading the log data into a data warehouse. This can be done using standard data integration tools that support flat files or using technologies like Hadoop MapReduce.
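The filtering and consolidation step can be sketched in a few lines. The example below assumes log records have already been parsed into dictionaries with date, path, and status fields; the field names, the list of static-file suffixes, and the keep and consolidate helpers are hypothetical. It runs in a single process, but the same filter-then-count shape is what a MapReduce job would distribute across machines.

```python
from collections import Counter

# Assumed noise to drop before loading: requests for static assets.
STATIC_SUFFIXES = (".css", ".js", ".gif", ".png", ".jpg", ".ico")


def keep(record):
    """Filter step: keep successful requests for pages, not static assets."""
    return record["status"] == 200 and not record["path"].endswith(STATIC_SUFFIXES)


def consolidate(records):
    """Consolidation step: count page views per (date, path)."""
    counts = Counter((r["date"], r["path"]) for r in records if keep(r))
    # One sorted row per (date, path), ready to write to a flat file for loading.
    return [(date, path, views) for (date, path), views in sorted(counts.items())]


records = [
    {"date": "2010-03-01", "path": "/", "status": 200},
    {"date": "2010-03-01", "path": "/", "status": 200},
    {"date": "2010-03-01", "path": "/logo.png", "status": 200},
    {"date": "2010-03-01", "path": "/missing", "status": 404},
    {"date": "2010-03-02", "path": "/", "status": 200},
]
for row in consolidate(records):
    print(row)
```

Five raw records collapse to two rows here, which is the kind of reduction that keeps log volumes manageable in the warehouse.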

Real-Time and Near-Real-Time Web Analytics

When real-time or near-real-time web analytics are required, there are two alternative approaches available from vendors.

Business Activity Monitoring (BAM)

BAM tracks and analyzes business transactions generated by web interactions as they pass through operational systems. BAM is useful for analyzing a continuous stream of business transactions and generating real-time reports and dashboards.
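The idea of analyzing a continuous stream for a live dashboard can be illustrated with a rolling window over incoming transactions. This is only a sketch under stated assumptions: the RollingMonitor class, its method names, and the 60-second window are invented for the example, timestamps are plain seconds that arrive in order, and no vendor's BAM product is being described.

```python
from collections import deque


class RollingMonitor:
    """Track order count and revenue over the most recent window_seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, amount), oldest first
        self.revenue = 0.0

    def record(self, timestamp, amount):
        """Add one transaction and drop any that fell out of the window."""
        self.events.append((timestamp, amount))
        self.revenue += amount
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_amount = self.events.popleft()
            self.revenue -= old_amount

    def snapshot(self):
        """Return the figures a dashboard would display right now."""
        return {"orders": len(self.events), "revenue": round(self.revenue, 2)}


monitor = RollingMonitor(window_seconds=60)
monitor.record(0, 20.0)
monitor.record(30, 5.0)
monitor.record(70, 10.0)  # the transaction at t=0 has now expired
print(monitor.snapshot())
```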

Complex Event Processing (CEP)

For more complex processing of transaction and event streams, products that support complex event processing (CEP) can be used. CEP solutions can analyze and correlate multiple streams of current and historical data, identify patterns and trends, and predict potential outcomes. Examples of vendors in this area include Aleri, IBM (WebSphere Business Events, InfoSphere Streams), Oracle (Oracle CEP), Tibco (BusinessEvents), and Truviso. Note that some vendors use the term business event processing (BEP) instead of CEP, while others use terms such as continuous intelligence and continuous analytics.
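A heavily reduced sketch of the pattern-detection side of CEP follows. It watches a single event stream and raises an alert when one visitor produces several "payment_failed" events within a short window. The event shape (timestamp, visitor, kind), the event name, the thresholds, and the detect_bursts function are assumptions for illustration; commercial CEP engines express such rules in their own languages and can also correlate multiple streams and historical data, which this example does not attempt.

```python
from collections import defaultdict, deque


def detect_bursts(events, threshold=3, window_seconds=60):
    """Yield (visitor, timestamp) each time a visitor reaches `threshold`
    'payment_failed' events within `window_seconds`.

    Events are (timestamp, visitor, kind) tuples assumed to arrive in
    timestamp order.
    """
    recent = defaultdict(deque)  # visitor -> timestamps of recent failures
    for timestamp, visitor, kind in events:
        if kind != "payment_failed":
            continue
        times = recent[visitor]
        times.append(timestamp)
        # Forget failures that are older than the window.
        while times and times[0] <= timestamp - window_seconds:
            times.popleft()
        if len(times) >= threshold:
            yield visitor, timestamp
            times.clear()  # reset so one burst raises one alert


stream = [
    (0, "a", "payment_failed"),
    (10, "b", "page_view"),
    (20, "a", "payment_failed"),
    (25, "b", "payment_failed"),
    (40, "a", "payment_failed"),
    (200, "a", "payment_failed"),
]
print(list(detect_bursts(stream)))
```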

Conclusion

In summary, the transition from traditional data warehousing to modern web analytics has been driven by the need for more comprehensive and real-time insights into web performance. Web analytics tools, such as Google Analytics, Coremetrics, and Omniture SiteCatalyst, provide valuable metrics that help businesses optimize their online presence and improve customer engagement. As the landscape of analytics continues to evolve, the integration of web data with enterprise data and the adoption of real-time analytics tools will be crucial for staying competitive. Embrace the power of web analytics and leverage the tools and techniques discussed to drive your business forward.

FAQ Section

  1. What is web analytics?

    • Web analytics is the process of collecting, measuring, analyzing, and reporting internet data to optimize web usage.

  2. What are the standard metrics supported by web analytics products?

    • Standard metrics include page views, visits, unique visitors, new visitors, returning visitors, clickthroughs, and conversions.

  3. What is the difference between page tagging and log file analysis?

    • Page tagging involves adding code to a web page that informs a third-party server when the page is rendered, while log file analysis involves analyzing data from the log files generated by the web server.

  4. What is Google Analytics?

    • Google Analytics is a free SaaS offered by Google that generates comprehensive visitor metrics for a website, aimed at marketers.

  5. What are some alternatives to Google Analytics?

    • Alternatives include Coremetrics, Omniture SiteCatalyst, Unica NetInsight, WebTrends Analytics, and Yahoo Web Analytics.

  6. What is business activity monitoring (BAM)?

    • BAM tracks and analyzes business transactions generated by web interactions as they pass through operational systems.

  7. What is complex event processing (CEP)?

    • CEP solutions analyze and correlate multiple streams of current and historical data to identify patterns and trends.

  8. How can web data be integrated with enterprise data?

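    • Web server log files can be filtered and consolidated, using standard data integration tools that support flat files or technologies like Hadoop MapReduce, and then loaded into a data warehouse where they can be correlated with other enterprise data.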

  9. What are the challenges in identifying website visitors?

    • Identifying visitors can be challenging and often requires the use of cookies or CRM tools to gather additional data.

  10. What is the purpose of WAA metrics?

    • WAA metrics provide after-the-fact summaries of past events, primarily intended for tactical and strategic decision-making.