Category: Amazon QuickSight
New – S3 Sync capability for EC2 Systems Manager: Query & Visualize Instance Software Inventory
With the fast-paced lives we all seem to lead, it is essential to find tools that make it easier to manage our time, our homes, and our work. The same holds true in technology: given its pace, technologists need management tools that make it easy to manage their systems. With the introduction of the Amazon EC2 Systems Manager service at re:Invent 2016, we aimed to help with the management of your systems and software.
If you are not yet familiar with Amazon EC2 Systems Manager, let me introduce this capability to you. EC2 Systems Manager is a management service that helps you create system images, collect software inventory, configure both Windows and Linux operating systems, and apply operating system patches. This collection of capabilities allows remote and secure administration of managed EC2 instances or hybrid environments with on-premises machines configured for Systems Manager. With this EC2 service capability, you can additionally record and regulate the software configuration of these instances using AWS Config.
Recently we added another feature to the inventory capability of EC2 Systems Manager to aid you in capturing metadata about your application deployments, OS, and system configurations: Resource Data Sync, aka S3 Sync. S3 Sync for EC2 Systems Manager allows you to automatically aggregate captured inventory data from instances in different regions and multiple accounts and store it in Amazon S3. With the data in S3, you can run queries against the instance inventory using Amazon Athena and, if you choose, use Amazon QuickSight to visualize the software inventory of your instances.
Let’s look at how we can utilize this Resource Data Sync (S3 Sync) feature with Amazon Athena and Amazon QuickSight to query and visualize the software inventory of instances. First things first, I will make sure that I have the Amazon EC2 Systems Manager prerequisites completed: configuration of the roles and permissions in AWS Identity and Access Management (IAM), as well as the installation of the SSM Agent on my managed instances. I’ll quickly launch a new EC2 instance for this Systems Manager example.
Now that my instance has launched, I will need to install the SSM Agent onto my aws-blog-demo-instance. One thing I should mention is that it is essential that your IAM user account has administrator access in the VPC in which your instance was launched. You can create a separate IAM user account for instances with EC2 Systems Manager, by following the instructions noted here: http://docs.thinkwithwp.com/systems-manager/latest/userguide/sysman-configuring-access-policies.html#sysman-access-user. Since I am using an account with administrative access, I won’t need to create an IAM user to continue installing the SSM Agent on my instance.
To install the SSM Agent, I will SSH into my instance, create a temporary directory, and pull down and install the necessary SSM Agent software for my Amazon Linux EC2 instance. An EC2 instance based upon a Windows AMI already includes the SSM Agent so I would not need to install the agent for Windows instances.
To complete the aforementioned tasks, I will issue the following commands:
mkdir /tmp/ssm
cd /tmp/ssm
sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
You can find the instructions to install the SSM Agent based upon the type of operating system of your EC2 instance in the Installing SSM Agent section of the EC2 Systems Manager user guide.
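Before moving on, it is worth confirming that the agent actually started. On an Amazon Linux instance the agent runs as an upstart job, so a quick check along these lines should do (this is just a sanity check, not a required step in the walkthrough):

# verify the agent is running; start it if it is not
sudo status amazon-ssm-agent
sudo start amazon-ssm-agent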
Now that I have the Systems Manager agent running on my instance, I’ll need an S3 bucket to capture the inventory data. I’ll create an S3 bucket, aws-blog-tew-posts-ec2, to capture the inventory data from my instance. I will also need to add a bucket policy to ensure that EC2 Systems Manager has permission to write to my bucket. Adding the bucket policy is simple: I select the Permissions tab in the S3 console and then click the Bucket Policy button. Then I specify a bucket policy that gives Systems Manager the ability to check bucket permissions and add objects to the bucket. With the policy in place, my S3 bucket is now ready to receive the instance inventory data.
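For reference, a minimal sketch of the kind of bucket policy I mean, applied with the AWS CLI, looks like the following. The statement IDs and JSON follow the pattern in the Systems Manager documentation, but treat this as illustrative rather than authoritative and double-check the currently documented policy before using it:

# write the policy to a local file, then attach it to the bucket
cat > ssm-sync-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SSMBucketPermissionsCheck",
      "Effect": "Allow",
      "Principal": { "Service": "ssm.amazonaws.com" },
      "Action": "s3:GetBucketAcl",
      "Resource": "arn:aws:s3:::aws-blog-tew-posts-ec2"
    },
    {
      "Sid": "SSMBucketDelivery",
      "Effect": "Allow",
      "Principal": { "Service": "ssm.amazonaws.com" },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::aws-blog-tew-posts-ec2/*",
      "Condition": { "StringEquals": { "s3:x-amz-acl": "bucket-owner-full-control" } }
    }
  ]
}
EOF
aws s3api put-bucket-policy --bucket aws-blog-tew-posts-ec2 --policy file://ssm-sync-policy.json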
To configure the inventory collection using this bucket, I will head back over to the EC2 console and select Managed Resources under Systems Manager Shared Resources section, then click the Setup Inventory button.
In the Targets section, I’ll manually select the EC2 instance I created earlier from which I want to capture the inventory data. You should note that you can select multiple instances for which to capture inventory data if desired.
Scrolling down to the Schedule section, I will choose 30 minutes as the interval at which inventory metadata should be gathered from my instance. Since I’m keeping the default Enabled value for all of the options in the Parameters section, and I am not going to write the association logs to S3 at this time, I only need to click the Setup Inventory button. When the confirmation dialog notes that the inventory has been set up successfully, I will click the Close button to go back to the main EC2 console.
Back in the EC2 console, I will set up my Resource Data Sync for my managed instance, using my aws-blog-tew-posts-ec2 S3 bucket, by selecting the Resource Data Syncs button.
To set up my Resource Data Sync, I will enter the Sync Name, Bucket Name, Bucket Prefix, and the Bucket Region in which my bucket is located. You should also be aware that the Resource Data Sync and the sync S3 target bucket can be located in different regions. Another thing to note is that the CLI command for completing this step is displayed, in case I opt to use the AWS CLI for creating the Resource Data Sync. I click the Create button and my Resource Data Sync setup is complete.
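If you do go the CLI route, the command looks roughly like the following. The sync name and prefix are simply illustrative values, and the Region should match where the bucket lives:

aws ssm create-resource-data-sync \
    --sync-name aws-blog-tew-posts-sync \
    --s3-destination "BucketName=aws-blog-tew-posts-ec2,Prefix=inventory,SyncFormat=JsonSerDe,Region=us-east-1"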
After a few minutes, I can go to my S3 bucket and see that my instance inventory data is syncing to my S3 bucket successfully.
With this data syncing directly into S3, I can take advantage of the querying capabilities of the Amazon Athena service to view and query my instance inventory data. I create a folder, athenaresults, within my aws-blog-tew-posts-ec2 S3 bucket, and now off to the Athena console I go!
In the Athena console, I will change the Settings option to point to my athenaresults folder in my bucket by entering: s3://aws-blog-tew-posts-ec2/athenaresults. Now I can create a database named tewec2ssminventorydata for capturing and querying the data sent from SSM to my bucket, by entering in a CREATE DATABASE SQL statement in the Athena editor and clicking the Run Query button.
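If you would rather script this step than type into the console editor, the same statement can be submitted through the Athena API. A minimal sketch with the AWS CLI, pointing the output location at the athenaresults folder created above:

aws athena start-query-execution \
    --query-string "CREATE DATABASE tewec2ssminventorydata" \
    --result-configuration "OutputLocation=s3://aws-blog-tew-posts-ec2/athenaresults/"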
With my database created, I’ll switch to my tewec2ssminventorydata database and create a table to grab the inventory application data from the S3 bucket synced from the Systems Manager Resource Data Sync.
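The full DDL is in the Systems Manager inventory walkthrough, but its general shape is roughly the sketch below: a few columns from the AWS:Application inventory type, the OpenX JSON SerDe, and partition columns that mirror the accountid/region/resourcetype folder structure that Resource Data Sync writes to the bucket. Treat the column list, prefix, and location as illustrative rather than exact:

aws athena start-query-execution \
    --query-execution-context "Database=tewec2ssminventorydata" \
    --result-configuration "OutputLocation=s3://aws-blog-tew-posts-ec2/athenaresults/" \
    --query-string "CREATE EXTERNAL TABLE IF NOT EXISTS tew_awsapplication (
        Name string, ApplicationType string, Publisher string, Version string, InstalledTime string
      )
      PARTITIONED BY (AccountId string, Region string, ResourceType string)
      ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
      LOCATION 's3://aws-blog-tew-posts-ec2/inventory/AWS:Application/'"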
As the query success message notes, I’ll run the MSCK REPAIR TABLE tew_awsapplication command to partition the newly created table. Now I can run queries against the inventory data being synced from the EC2 Systems Manager to my Amazon S3 buckets. You can learn more about querying data with Amazon Athena on the product page and you can review my blog post on querying and encrypting data with Amazon Athena.
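As a quick sanity check, a query along these lines lists the most common applications across the synced instances (the column names follow the AWS:Application inventory schema, applicationtype being the field I chart later on):

aws athena start-query-execution \
    --query-execution-context "Database=tewec2ssminventorydata" \
    --result-configuration "OutputLocation=s3://aws-blog-tew-posts-ec2/athenaresults/" \
    --query-string "SELECT name, applicationtype, version, count(*) AS instances
                    FROM tew_awsapplication
                    GROUP BY name, applicationtype, version
                    ORDER BY instances DESC LIMIT 25"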
Now that I can query this data, I can also use Amazon QuickSight to visualize it.
If you haven’t created an Amazon QuickSight account, you can quickly follow the getting started instructions to set up your QuickSight account. Since I already have a QuickSight account, I’ll go to the QuickSight dashboard and select the Manage Data button. On the Your Data Sets screen, I’ll select the New data set button.
Now I can create a dataset from my Athena table holding the Systems Manager Inventory Data by selecting Athena as my data source.
This takes me through a series of steps to create my data source from the Athena tewec2ssminventorydata database and the tew_awsapplication table.
After choosing Visualize to create my data set and analyze the data in the Athena table, I am taken to the QuickSight dashboard where I can build graphs and visualizations for my EC2 Systems Manager inventory data.
Adding the applicationtype field to my graph allows me to build a visualization using this data.
Summary
With the new Amazon EC2 Systems Manager Resource Data Sync capability to send inventory data to Amazon S3 buckets, you can now create robust data queries using Amazon Athena and build visualizations of this data with Amazon QuickSight. You no longer have to create custom scripts to aggregate your instance inventory data into an Amazon S3 bucket; the data can now be automatically synced and stored in Amazon S3, allowing you to keep your data even after your instance has been terminated. This new EC2 Systems Manager capability also allows you to send inventory data to S3 from multiple accounts and different regions.
To learn more about Amazon EC2 Systems Manager and EC2 Systems Manager Inventory, take a look at the product pages for the service. You can also build your own query and visualization solution for the EC2 instance inventory data captured in S3 by checking out the EC2 Systems Manager user guide on Using Resource Data Sync to Aggregate Inventory Data.
In the words of my favorite Vulcan, “Live long, query and visualize and prosper” with EC2 Systems Manager.
– Tara
New – Amazon QuickSight Enterprise Edition
When I first wrote about Amazon QuickSight in 2015 (Amazon QuickSight – Fast & Easy to Use Business Intelligence for Big Data at 1/10th the Cost of Traditional Solutions), I mentioned that we would make the service available in Standard and Enterprise Editions.
Enterprise Edition
We launched the Standard Edition of Amazon QuickSight last month. Today we are launching the Enterprise Edition. Building on the capabilities of the Standard Edition, the Enterprise Edition adds Active Directory Integration and Encryption at Rest.
The Enterprise Edition supports user authentication via AWS Managed Microsoft Active Directory (AD). This allows your users to sign on to QuickSight using the credentials already in your AWS-hosted Microsoft AD or in a trusted on-premises AD. Either way, the Single Sign On (SSO) experience allows your users to get started more quickly and with less administrative overhead.
If you are responsible for administering your organization’s base of QuickSight users, you can bring thousands of users on board and manage their permissions with just a few clicks. You can manage this user base using your existing toolset and with respect to your existing governance policies.
Here’s how it all fits together:
QuickSight relies on SPICE (Super-fast, Parallel, In-memory Calculation Engine) for highly scalable ad hoc analytics. The Enterprise Edition of QuickSight encrypts data at rest within SPICE using keys managed by Amazon in order to provide an additional layer of protection for your data.
Take Enterprise Edition for a Spin
From the administrative side, setting up the Enterprise Edition of QuickSight is really easy, and requires that you sign in as an IAM user with the requisite set of permissions (visit Sign up for Amazon QuickSight with an Existing AWS Account and scroll down to Set Your IAM Policy to learn more).
You select the Enterprise Edition, choose the AWS Managed AD Directory that describes your user community, and authorize access to the directory. Then you add (if not already present) an alias for the directory and use this as the QuickSight account name. Finally, you pick the AD groups within the Managed AD or trusted forest and enable them for QuickSight access.
After you have completed the sign up process, users in the designated groups will be able to log in to QuickSight using the QuickSight account name (directory alias) and their existing AD credentials.
Password restrictions, timeouts, and user management are all handled through the appropriate AD (AWS or on-premises) settings in accordance with your organization’s policies. You can manage group membership using your existing tools, adding and removing users and assigning user & admin roles as needed.
Pricing and Availability
Amazon QuickSight Enterprise Edition is now available in the US East (Northern Virginia) Region, with SPICE capacity available in the US East (Northern Virginia), US West (Oregon), and EU (Ireland) Regions. Pricing starts at $18 per month per user and includes 10 GB of SPICE capacity, which is pooled across all of the QuickSight users on the account (the single user QuickSight free tier and the 60 day, 4 user free trial apply to the Enterprise Edition as well). To learn more, visit the QuickSight Pricing page.
If you have a Microsoft AD instance provisioned in the US East (Northern Virginia) Region, you can take advantage of the free tier and free trial, and get started today at no cost.
Amazon Kinesis – Setting up a Streaming Data Pipeline
Ray Zhu from the Amazon Kinesis team wrote this great post about how to set up a streaming data pipeline. He carefully shows you step by step how he set it all up and how you can do it too.
-Ana
Consumer demand for better experiences is ever increasing today. Companies across different industry segments are looking for ways to differentiate their products and services. Data is a key ingredient for providing differentiated products and services, and this is no longer a secret but rather a well-adopted practice. Almost all companies of meaningful size use some sort of data technology, which means that being able to collect and use data is no longer enough as a differentiating factor. Then what? How fast you can collect and use your data becomes the key to staying competitive.
Streaming data technologies shorten the time to analyze and use your data from hours and days to minutes and seconds. Let’s walk through an example of using Amazon Kinesis Firehose, Amazon Redshift, and Amazon QuickSight to set up a streaming data pipeline and visualize Maryland traffic violation data in real time.
Data Flow Overview
Step 1 Set up Redshift database and table
In this step, I’ll set up a Redshift table for Kinesis Firehose to continuously load streaming data into. I first start a single node Redshift cluster and name it “raystreaming.”
Under “Additional Configuration”, I make the cluster publicly accessible so that Kinesis Firehose and QuickSight can connect to my cluster.
After reviewing all configurations, I click on “Launch Cluster”.
Once the cluster is active, I go to the cluster’s VPC Security Groups to add inbound access for Kinesis Firehose service IPs and outbound access for QuickSight service IPs.
Kinesis Firehose service IPs:
- US East (N. Virginia): 52.70.63.192/27
- US West (Oregon): 52.89.255.224/27
- EU (Ireland): 52.19.239.192/27
QuickSight service IPs:
- US East (N. Virginia): 52.23.63.224/27
- US West (Oregon) (us-west-2): 54.70.204.128/27
- EU (Ireland) (eu-west-1): 52.210.255.224/27
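If you prefer to script the security group changes, the same rules can be added with the AWS CLI. A sketch for the US East (N. Virginia) Firehose range, assuming a placeholder security group ID of sg-0123456789abcdef0 and the default Redshift port of 5439 (repeat with the QuickSight range for your region):

aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 5439 \
    --cidr 52.70.63.192/27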
Now the cluster is set up and configured. I’ll use a JDBC tool and the SQL statement below to create a table for storing Maryland traffic violation data.
create table TrafficViolation(
  dateofstop date, timeofstop timestamp, agency varchar(100), subagency varchar(100),
  description varchar(300), location varchar(100), latitude varchar(100), longtitude varchar(100),
  accident varchar(100), belts varchar(100), personalinjury varchar(100), propertydamage varchar(100),
  fatal varchar(100), commlicense varchar(100), hazmat varchar(100), commvehicle varchar(100),
  alcohol varchar(100), workzone varchar(100), state varchar(100), veichletype varchar(100),
  year varchar(100), make varchar(100), model varchar(100), color varchar(100),
  violation varchar(100), type varchar(100), charge varchar(100), article varchar(100),
  contributed varchar(100), race varchar(100), gender varchar(100), drivercity varchar(100),
  driverstate varchar(100), dlstate varchar(100), arresttype varchar(100), geolocation varchar(100)
);
Step 2 Set up Kinesis Firehose delivery stream
In this step, I’ll set up a Kinesis Firehose delivery stream to continuously deliver data to the “TrafficViolation” table created above.
I name my Firehose delivery stream “rayredshiftstream”. Under destination configurations, I choose “Amazon Redshift” as the destination and configure an intermediate S3 bucket. Kinesis Firehose will first load my streaming data into this intermediate bucket and then COPY it to Redshift. Loading data from S3 to Redshift is efficient and preserves resources on Redshift for queries. Also, I always have a backup of my data in S3 for other batch processes or in case my Redshift cluster is not accessible (e.g. under maintenance).
Subsequently, I enter the Redshift cluster, database, and table names along with Redshift user name and password. This user needs to have Redshift INSERT permission. I also specify “json ‘auto’” under COPY options to parse JSON formatted sample data.
I set retry duration to 30 seconds. In cases when data load to my Redshift cluster fails, Kinesis Firehose will retry for 30 seconds. The failed data is always in the intermediate S3 bucket for backfill. At the bottom, the exact COPY command Kinesis Firehose will use is generated for testing purposes.
On the next page, I specify buffer size and buffer interval. Kinesis Firehose buffers streaming data to a certain size or for a certain period of time before loading it to S3. Kinesis Firehose’s buffering feature reduces S3 PUT requests and cost significantly and generates relatively larger S3 object size for efficient data load to Redshift. I’m using the smallest buffer size (1MB) and shortest buffer interval (60 seconds) in this example in order to have data delivered sooner.
You can also optionally configure Kinesis Firehose to compress the data in GZIP format before loading it to S3 and use a KMS key to encrypt the data in S3. In this example, I configure my data to be uncompressed and unencrypted. Please note that if you enable GZIP compression, you’ll also need to add “gzip” under Redshift COPY options.
I also enable error logging for Kinesis Firehose to log any delivery errors to my CloudWatch Log group. The error messages are viewable from the Kinesis Firehose console as well and are particularly useful for troubleshooting purposes.
Finally, I configure a default IAM role to allow Kinesis Firehose to access the resources I configured in the delivery stream.
After reviewing all configurations, I click on “Create Delivery Stream”.
Step 3 Send data to Kinesis Firehose delivery stream
Now my Firehose delivery stream is set up and pointing to my Redshift table “TrafficViolation”. In this example, I’m using the Traffic Violations dataset from US Government Open Data. I use the Kinesis Firehose sample from the AWS Java SDK to parse records from a local CSV file and send each record to my delivery stream.
In real streaming use cases, you can imagine that each data record is pushed to the delivery stream from police officers’ cellular devices through Firehose’s PutRecord() or PutRecordBatch() APIs as soon as a violation ticket is recorded.
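You don’t need the Java sample just to try the stream out; a single record can also be pushed from the command line. A rough sketch (the field values are made up, only a few of the varchar columns are included, and with AWS CLI v2 you would also add --cli-binary-format raw-in-base64-out since Data is a blob parameter):

# send one JSON record to the delivery stream; keys match the table's column names
aws firehose put-record \
    --delivery-stream-name rayredshiftstream \
    --record '{"Data":"{\"agency\":\"MCP\",\"description\":\"FAILURE TO STOP AT STEADY CIRCULAR RED SIGNAL\",\"gender\":\"M\",\"violation\":\"21-202(h1)\"}\n"}'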
A sample of the data looks like the following and includes information such as time of stop, vehicle type, driver gender, and so forth.
09/30/2014,23:51:00,MCP,"1st district, Rockville",\
DRIVER FAILURE TO STOP AT STEADY CIRCULAR RED SIGNAL,\
PARK RD AT HUNGERFORD DR,,,No,No,No,No,No,No,No,No,No,No,\
MD,02 - Automobile,2014,FORD,MUSTANG,BLACK,Citation,21-202(h1),\
Transportation Article,No,BLACK,M,ROCKVILLE,MD,MD,A - Marked Patrol,
Step 4 Visualize the data from QuickSight
As I continuously push data records to my delivery stream “rayredshiftstream”, I can see this data being populated into my Redshift table “TrafficViolation” continuously.
Now I’m going to use QuickSight to analyze and visualize the data from my Redshift table “TrafficViolation”. I create a new analysis and a new data set pointing to my Redshift table “TrafficViolation”.
I use “Query” mode to directly retrieve data from my Redshift cluster so that new data is retrieved as it is continuously streamed from Kinesis Firehose.
With a few clicks, I create a bar chart that displays the number of traffic violations by gender and hour of the day. There are a few interesting patterns: 1) Male drivers have significantly more traffic violations than female drivers during morning hours. 2) Noon has the lowest number of violations. 3) From 2pm to 4pm, the gap in the number of violations between male and female drivers narrows.
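It is also easy to double-check what the chart shows with a quick query directly against the table; a sketch using psql (substitute your own cluster endpoint, user, and database):

psql -h <your-cluster-endpoint> -p 5439 -U master -d dev \
    -c "SELECT gender, EXTRACT(hour FROM timeofstop) AS hour_of_day, COUNT(*) AS violations
        FROM TrafficViolation
        GROUP BY 1, 2
        ORDER BY 2, 1;"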
With a live dashboard, this graph will keep updating itself throughout the day as new data is continuously streamed from police officers’ devices to Redshift through Kinesis Firehose. Another interesting live dashboard to build would be a map graph that shows a heat map of traffic violations across different districts of Maryland over time. I’ll leave this exercise to the readers of this blog; you can use your favorite business intelligence tools to do so.
That’s it!
Hopefully, through reading this blog and trying it out yourself, you’ve gotten some inspiration about streaming data and a sense of how easy it is to get started with streaming data analytics on AWS. I can’t wait to see what streaming data analytics pipelines and applications you build for your organizations!
-Ray Zhu
Amazon QuickSight Now Generally Available – Fast & Easy to Use Business Analytics for Big Data
After a preview period that included participants from over 1,500 AWS customers ranging from startups to global enterprises, I am happy to be able to announce that Amazon QuickSight is now generally available! When I invited you to join the preview last year, I wrote:
In the past, Business Intelligence required an incredible amount of undifferentiated heavy lifting. You had to pay for, set up and run the infrastructure and the software, manage scale (while users fret), and hire consultants at exorbitant rates to model your data. After all that your users were left to struggle with complex user interfaces for data exploration while simultaneously demanding support for their mobile devices. Access to NoSQL and streaming data? Good luck with that!
Amazon QuickSight provides you with very fast, easy to use, cloud-powered business analytics at 1/10th the cost of traditional on-premises solutions. QuickSight lets you get started in minutes. You log in, point to a data source, and begin to visualize your data. Behind the scenes, the SPICE (Super-fast, Parallel, In-Memory Calculation Engine) will run your queries at lightning speed and provide you with highly polished data visualizations.
Deep Dive into Data
Every customer that I speak with wants to get more value from their stored data. They realize that the potential value locked up within the data is growing by the day, but are sometimes disappointed to learn that finding and unlocking that value can be expensive and difficult. On-premises business analytics tools are expensive to license and can place a heavy load on existing infrastructure. Licensing costs and the complexity of the tools can restrict the user base to just a handful of specialists. Taken together, all of these factors have led many organizations to conclude that they are not ready to make the investment in a true business analytics function.
QuickSight is here to change that! It runs as a service and makes business analytics available to organizations of all shapes and sizes. It is fast and easy to use, does not impose a load on your existing infrastructure, and is available for a monthly fee that starts at just $9 per user.
As you’ll see in a moment, QuickSight allows you to work on data that’s stored in many different services and locations. You can get to your Amazon Redshift data warehouse, your Amazon Relational Database Service (RDS) relational databases, or your flat files in S3. You can also use a set of connectors to access data stored in on-premises MySQL, PostgreSQL, and SQL Server databases, Microsoft Excel spreadsheets, Salesforce and other services.
QuickSight is designed to scale with you. You can add more users, more data sources, and more data without having to purchase more long-term licenses or roll more hardware into your data center.
Take the Tour
Let’s take a tour through QuickSight. The administrator for my organization has already invited me to use QuickSight, so I am ready to log in and get started. Here’s the main screen:
I’d like to start by getting some data from a Redshift cluster. I click on Manage data and review my existing data sets:
I don’t see what I am looking for, so I click on New data set and review my options:
I click on Redshift (manual connect) and enter the credentials so that I can access my data warehouse (if I had a Redshift cluster running within my AWS account it would be available as an auto-discovered source):
QuickSight queries the data warehouse and shows me the schemas (sets of tables) and the tables that are available to me. I’ll select the public schema and the all_flights table to get started:
Now I have two options. I can pull the table in to SPICE for quick analysis or I can query it directly. I’ll pull it in to SPICE:
Again, I have two options! I can click on Edit/Preview data and select the rows and columns to import, or I can click on Visualize to import all of the data and proceed to the fun part! I’ll go for Edit/Preview. I can see the fields (on the left), and I can select only those that are of interest using the checkboxes:
I can also click on New Filter, select a field from the popup menu, and then create a filter:
Both options (selecting fields and filtering on rows) allow me to control the data that I pull in to SPICE. This allows me to control the data that I want to visualize and also helps me to make more efficient use of memory. Once I am ready to proceed, I click on Prepare data & visualize. At this point the data is loaded in to SPICE and I’m ready to start visualizing it. I simply select a field to get started. For example, I can select the origin_state_abbr field and see how many flights originate in each state:
The miniaturized view on the right gives me some additional context. I can scroll up or down or select the range of values to display. I can also click on a second field to learn more. I’ll click on flights, set the sort order to descending, and scroll to the top. Now I can see how many of the flights in my data originated in each state:
QuickSight’s AutoGraph feature automatically generates an appropriate visualization based on the data selected. For example, if I add the fl_date field, I get a state-by-state line chart over time:
Based on my query, the data types, and properties of the data, QuickSight also proposes alternate visualizations:
I also have my choice of many other visual types including vertical & horizontal bar charts, line charts, pivot tables, tree maps, pie charts, and heat maps:
Once I have created some effective visualizations, I can capture them and use the resulting storyboard to tell a data-driven story:
I can also share my visualizations with my colleagues:
Finally, my visualizations are accessible from my mobile device:
Pricing & SPICE Capacity
QuickSight comes with one free user and 1 GB of SPICE capacity for free, perpetually. This allows every AWS user to analyze their data and gain business insights at no cost. The Standard Edition of Amazon QuickSight starts at $9 per month and includes 10 GB of SPICE capacity (see the QuickSight Pricing page for more info).
It is easy to manage SPICE capacity. I simply click on Manage QuickSight in the menu (I must have the ADMIN role in order to be able to make changes):
Then I can see where I stand:
I can click on Purchase more capacity to do exactly that:
I can also click on Release unused purchased capacity in order to reduce the amount of SPICE capacity that I own:
Get Started Today
Amazon QuickSight is now available in the US East (Northern Virginia), US West (Oregon), and EU (Ireland) regions and you can start using it today.
Despite the length of this blog post I have barely scratched the surface of QuickSight. Given that you can use it at no charge, I would encourage you to sign up, load some of your data, and take QuickSight for a spin!
We have a webinar coming up on January 16th where you can learn even more! Sign up here.
— Jeff;
New – Upload AWS Cost & Usage Reports to Redshift and QuickSight
Many AWS customers have been asking us for a way to programmatically analyze their Cost and Usage Reports (read New – AWS Cost and Usage Reports for Comprehensive and Customizable Reporting for more info). These customers are often using AWS to run multiple lines of business, making use of a wide variety of services, often spread out across multiple regions. Because we provide very detailed billing and cost information, this is a Big Data problem and one that can be easily addressed using AWS services!
While I was on vacation earlier this month, we launched a new feature that allows you to upload your Cost and Usage reports to Amazon Redshift and Amazon QuickSight. Now that I am caught up, I’d like to tell you about this feature.
Upload to Redshift
I started by creating a new Redshift cluster (if you already have a running cluster, you need not create another one). Here’s my cluster:
Next, I verified that I had enabled the Billing Reports feature:
Then I hopped over to the Cost and Billing Reports and clicked on Create report:
Next, I named my report (MyReportRedshift), made it Hourly, and enabled support for both Redshift and QuickSight:
I wrapped things up by selecting my delivery options:
I confirmed my desire to create a report on the next page, and then clicked on Review and Complete. The report was created and I was informed that the first report would arrive in the bucket within 24 hours:
While I was waiting I installed PostgreSQL on my EC2 instance (sudo yum install postgresql94) and verified that I was signed up for the Amazon QuickSight preview. Also, following the directions in Create an IAM Role, I made a read-only IAM role and captured its ARN:
Back in the Redshift console, I clicked on Manage IAM Roles and associated the ARN with my Redshift cluster:
The next day, I verified that the files were arriving in my bucket as expected, and then returned to the console in order to retrieve a helper file so that I could access Redshift:
I clicked on Redshift file and then copied the SQL command:
I inserted the ARN and the S3 region name into the SQL (I had to add quotes around the region name in order to make the query work as expected):
And then I connected to Redshift using psql (I can use any visual or CLI-based SQL client):
$ psql -h jbcluster.XYZ.us-east-1.redshift.amazonaws.com \
-U root -p 5439 -d dev
Then I ran the SQL command. It created a pair of tables and imported the billing data from S3.
Querying Data in Redshift
Using some queries supplied by my colleagues as a starting point, I summed up my S3 usage for the month:
And then I looked at my costs on a per-AZ basis:
And on a per-AZ, per-service basis:
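To give a flavor of what these queries look like: the helper SQL creates a table whose column names are derived from the report’s headers (lineItem/AvailabilityZone becomes lineitem_availabilityzone, and so on), so a per-AZ, per-service rollup is roughly the query below. The table name here is hypothetical and the cast guards against the cost column having been created as text; substitute whatever the helper file actually created for you:

psql -h jbcluster.XYZ.us-east-1.redshift.amazonaws.com -U root -p 5439 -d dev \
    -c "SELECT lineitem_availabilityzone AS az,
               product_productname AS service,
               SUM(CAST(lineitem_blendedcost AS DECIMAL(18,8))) AS cost
        FROM awsbilling201605
        GROUP BY 1, 2
        ORDER BY cost DESC;"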
Just for fun, I spent some time examining the Redshift Console. I was able to see all of my queries:
Analyzing Data with QuickSight
I also spent some time analyzing the cost and billing data using Amazon QuickSight. I signed in and clicked on Connect to another data source or upload a file:
Then I dug in to my S3 bucket (jbarr-bcm) and captured the URL of the manifest file (MyReportRedshift-RedshiftManifest.json):
I selected S3 as my data source and entered the URL:
QuickSight imported the data in a few seconds and the new data source was available. I loaded it into SPICE (QuickSight’s in-memory calculation engine). With three or four more clicks I focused on the per-AZ data, and excluded the data that was not specific to an AZ:
Another click and I switched to a pie chart view:
I also examined the costs on a per-service basis:
As you can see, the new data and the analytical capabilities of QuickSight allow me (and you) to dive deep into your AWS costs in minutes.
Available Now
This new feature is available now and you can start using it today!
— Jeff;
New – AWS Cost and Usage Reports for Comprehensive and Customizable Reporting
Many of our customers have been asking us for data and tools to allow them to better understand and manage their AWS costs.
New Reports
Today we are introducing a set of new AWS Cost and Usage Reports that provide you with comprehensive data about products, pricing, and usage. The reports allow you to understand individual costs and to analyze them in greater detail. For example, you can view your EC2 costs by instance type and then drill down to understand usage by operating system, instance type, and purchase option (On-Demand, Reserved, or Spot).
The new reports are generated in CSV form and can be customized. You can select the data included in each report, decide whether you want it aggregated across an hour or a day, and then request delivery to one of your S3 buckets, with your choice of ZIP or GZIP compression. The data format is normalized so that each discrete cost component is presented in an exclusive column.
You can easily upload the reports to Amazon Redshift and then run queries against the data using business intelligence and data visualization tools including Amazon QuickSight.
Creating a Report
To create a report, head on over to the AWS Management Console, and choose Billing & Cost Management from the menu in the top-right:
Then click on Reports in the left navigation:
Click on Create report to create your first report:
Enter a name for your report, pick a time unit, and decide whether you want to include Resource IDs (more detail and a bigger file) or not:
Now choose your delivery options: pick an S3 bucket (you’ll need to set the permissions per the sample policy), set a prefix if you’d like, and select the desired compression (GZIP or ZIP):
Click on Next, review your choices, and then create your report. It will become visible on the AWS Cost and Usage Reports page:
A fresh report will be delivered to the bucket within 24 hours. Additional reports will be provided every 24 hours (or less) thereafter.
From there you can transfer them to Amazon Redshift using an AWS Data Pipeline job or some code triggered by an AWS Lambda function, and then analyze them using the BI or data visualization tool of your choice.
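If you just want to kick the tires before wiring up a pipeline, a single report file can also be loaded by hand. A rough sketch, assuming a hypothetical table named cost_usage already exists with columns matching the report, an IAM role that can read the bucket, and placeholder endpoint and S3 path values:

psql -h mycluster.XYZ.us-east-1.redshift.amazonaws.com -U root -p 5439 -d dev \
    -c "COPY cost_usage
        FROM 's3://my-report-bucket/<path-to-report>/report-1.csv.gz'
        IAM_ROLE 'arn:aws:iam::123456789012:role/MyReadOnlyReportRole'
        CSV GZIP IGNOREHEADER 1;"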
Visualizing the Data
Here are some sample visualizations, courtesy of Amazon QuickSight. Looking at our EC2 spend by instance type gives an overall picture of our spending:
Viewing it over time shows that spending varies considerably from day to day:
Learn More
To learn more, read about Understanding Your Usage with Billing Reports.
— Jeff;
Learn About the newest AWS Services – Attend our October Webinars
If you attended AWS re:Invent, you were among the first to know about Amazon QuickSight, AWS IoT, Kinesis Firehose, and our other new offerings. Perhaps you had time to attend a session to learn more about the new service or services that interest you. If you didn’t attend re:Invent, or missed a session or two and are ready to learn more, I’ve got you covered. We will be running nine new-product webinars later this month. Each webinar is designed to provide you with the information you need in order to be up and running as quickly as possible.
Here’s what we have for you! The webinars are free but “seating” is limited and you should definitely sign up ahead of time if you want to attend (all times are Pacific):
Tuesday, October 27
QuickSight is a fast, cloud-powered business intelligence tool. You can build visualizations, perform ad-hoc analysis, and get business insights from your data.
- Webinar: Introducing Amazon QuickSight (9 – 10 AM).
- Blog Post: Amazon QuickSight – Fast & Easy to Use Business Intelligence for Big Data at 1/10th the Cost of Traditional Solutions.
AWS IoT is a managed cloud platform that lets connected devices easily and securely interact with cloud applications and other devices.
- Webinar: Getting Started with AWS IoT (10:30 – 11:30 AM).
- Blog Post: AWS IoT – Cloud Services for Connected Devices.
Amazon Kinesis Firehose is the easiest way to load streaming data into AWS.
- Webinar: Introducing Amazon Kinesis Firehose (Noon – 1 PM).
- Blog Post: Amazon Kinesis Firehose – Simple & Highly Scalable Data Ingestion.
Wednesday, October 28
Spot Blocks allow you to launch Spot instances that will run for a finite duration (1 to 6 hours).
- Webinar: Introducing Spot Blocks (9 – 10 AM).
- Blog Post: New – EC2 Spot Blocks for Defined-Duration Workloads.
AWS WAF is a web application firewall that helps protect your web applications from common exploits.
- Webinar: Introducing AWS WAF (10:30 – 11:30 AM).
- Blog Post: New AWS WAF.
Amazon Elasticsearch Service is a managed service that makes it easy to deploy, operate, and scale Elasticsearch in the AWS Cloud.
- Webinar: Introducing Amazon Elasticsearch Service (Noon – 1 PM).
- Blog Post: New – Amazon Elasticsearch Service.
Thursday, October 29
AWS Lambda lets you run code in the cloud without provisioning or managing servers.
- Webinar: AWS Lambda Best Practices: Python, Scheduled Jobs, and More (9 – 10 AM).
- Blog Post: AWS Lambda Update – Python, VPC, Increased Function Duration, Scheduling, and More.
AWS Mobile Hub provides an integrated console that helps you build, test, and monitor your mobile apps.
- Webinar: Introducing AWS Mobile Hub (10:30 – 11:30 AM).
- Blog Post: AWS Mobile Hub – Build, Test, and Monitor Mobile Applications.
AWS Import/Export Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS.
- Webinar: Introducing AWS Import/Export Snowball (Noon – 1 PM).
- Blog Post: AWS Import/Export Snowball – Transfer 1 Petabyte Per Week Using Amazon-Owned Storage Appliances.
— Jeff;