AWS Partner Network (APN) Blog
Category: Analytics
How Mactores Tripled Performance by Migrating from Oracle to Amazon Redshift with Zero Downtime
Mactores used a five-step approach to migrate a large manufacturing company from an on-premises Oracle data warehouse to Amazon Redshift with zero downtime. The migration tripled the performance of the customer’s reports, dashboards, and business processes, and lowered total cost of ownership (TCO) by 30 percent. Data refresh times dropped from 48 hours to three hours.
Building a Data Processing and Training Pipeline with Amazon SageMaker
Next Caller uses machine learning on AWS to drive data analysis and the processing pipeline. Amazon SageMaker helps Next Caller understand call pathways through the telephone network, rendering analysis in approximately 125 milliseconds with the VeriCall analysis engine. VeriCall verifies that a phone call is coming from the physical device that owns the phone number, and flags spoofed calls and other suspicious interactions in real time.
Q/Kdb+ on AWS Lambda: Serverless Time-Series Analytics at Scale
AWS Lambda is a particularly desirable environment for HPC applications because of the high level of parallelization it supports. Kx, an APN Advanced Technology Partner, created a q/kdb+ runtime that enables financial institutions to optimize their applications for the serverless environment of AWS Lambda. Q/kdb+ has been widely adopted by the financial services industry because of its small footprint, high performance, and high-volume time-series analytics capabilities.
Monitoring Your Palo Alto Networks VM-Series Firewall with a Syslog Sidecar
By hosting a Palo Alto Networks VM-Series firewall in an Amazon VPC, you can use AWS native cloud services—such as Amazon CloudWatch, Amazon Kinesis Data Streams, and AWS Lambda—to monitor your firewall for changes in configuration. This post explains why that’s desirable and walks you through the steps required to do it. The result is a way to monitor your Palo Alto Networks firewall that closely resembles how you monitor your AWS environment with AWS Config.
Accelerating Machine Learning with Qubole and Amazon SageMaker Integration
Data scientists creating enterprise machine learning models to process large volumes of data spend a significant portion of their time managing the infrastructure required to process the data, rather than exploring the data and building ML models. You can reduce this overhead by running Qubole data processing tools together with Amazon SageMaker. An open data lake platform, Qubole automates the administration and management of your resources on AWS.
How to Use AWS Glue to Prepare and Load Amazon S3 Data for Analysis by Teradata Vantage
Customers want to use Teradata Vantage to analyze the data they have stored in Amazon S3, but the AWS service that prepares and loads data stored in S3 for analytics, AWS Glue, does not natively support Teradata Vantage. To use AWS Glue to prep and load data for analysis by Teradata Vantage, you need to rely on AWS Glue custom database connectors. Follow step-by-step instructions and learn how to set up Vantage and AWS Glue to perform Teradata-level analytics on the data you have stored in Amazon S3.
Running SQL on Amazon Athena to Analyze Big Data Quickly and Across Regions
Data is the lifeblood of a digital business and a key competitive advantage for many companies holding large amounts of data in multiple cloud regions. Imperva protects web applications and data assets, and in this post we examine how you can use SQL to analyze big data directly, or to pre-process the data for further analysis by machine learning. You’ll also learn about the benefits and limitations of using SQL, and see examples of clustering and data extraction.
Powering Enterprise Analytics at Scale Using Teradata Vantage on AWS
The amount and variety of existing and newly generated data in today’s connected world is unparalleled. As this growth continues, so does the opportunity for organizations to extract real value from their data. Teradata Vantage is a modern analytics platform that combines open source and commercial analytic technologies. It can drive autonomous decision-making by helping you to operationalize insights, solve complex business problems, and enable descriptive, predictive, and prescriptive analytics.
Lower TCO and Increase Query Performance by Running Hive on Spark in Amazon EMR
Learn how Mactores helped Seagate Technology to use Apache Hive on Apache Spark for queries larger than 10TB, combined with the use of transient Amazon EMR clusters leveraging Amazon EC2 Spot Instances. It was imperative for Seagate to have systems in place to ensure the cost of collecting, storing, and processing data did not exceed their ROI. Moving to Hive on Spark enabled Seagate to continue processing petabytes of data at scale with significantly lower TCO.
Analyze Streaming Data from Amazon Managed Streaming for Apache Kafka Using Snowflake
When streaming data comes in from a variety of sources, organizations should be able to ingest this data quickly and join it with other relevant business data to derive insights and provide positive experiences to customers. Learn how you can build and run a fully managed, Apache Kafka-compatible Amazon MSK cluster to ingest streaming data, and explore how to use a Kafka Connect application to persist this data to Snowflake. This enables businesses to derive near real-time insights into end users’ experiences and feedback.