Institutional data analytics

Utilizing Amazon S3 to Support a Data Infrastructure

(This post is part two of a series on how HelioCampus uses AWS to support our data analytics platform.)

Today’s data processes can generate massive amounts of data at many points in the data flow, and organizing that data is a challenge without an easy way to store it in the cloud. Amazon Simple Storage Service (Amazon S3) gives organizations of any size a cost-efficient, secure, and scalable way to store data without compromising retrieval performance.

Using Amazon S3 for data storage allows us to build a data lake where we store and retrieve copies of data, develop efficient processes to load our data warehouse, and archive frozen copies of data. It also gives our internal teams and our customers an easy, secure way to load their own data from any location so that it can be overlaid with our data models.

Amazon S3 stores data as objects rather than as files in a file system hierarchy or blocks on a disk. An object is a file together with any optional metadata describing it. Objects are stored within a resource called a “bucket,” which is sort of like a folder in the cloud. You decide which AWS Region S3 stores your bucket in, and you control who gets access to it. Access can be granted to users, groups, or roles, and requests are authenticated using either long-term access keys or temporary security credentials (more on this later). Together, object metadata and bucket-level settings enable critical capabilities such as versioning objects, tagging objects for cost allocation, choosing encryption options, controlling and logging access, and hosting static websites.
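To make the object model concrete, here is a minimal Python sketch that assembles the parameters for uploading a single object with user metadata, cost-allocation tags, and server-side encryption. It assumes the boto3 SDK is what you use to talk to S3; the helper names `build_put_request` and `upload`, and the bucket, key, metadata, and tag values, are hypothetical placeholders for illustration and are not taken from HelioCampus code.

```python
from urllib.parse import urlencode


def build_put_request(bucket, key, body, metadata=None, tags=None):
    """Assemble keyword arguments for boto3's S3 put_object call.

    Hypothetical helper: it only builds a dict and does not contact AWS.
    """
    request = {
        "Bucket": bucket,                  # the bucket that holds the object
        "Key": key,                        # the object's name within the bucket
        "Body": body,                      # the file contents (bytes)
        "ServerSideEncryption": "AES256",  # S3-managed encryption at rest
    }
    if metadata:
        # User-defined metadata is a flat mapping of string keys to strings.
        request["Metadata"] = dict(metadata)
    if tags:
        # put_object expects tags encoded like URL query parameters.
        request["Tagging"] = urlencode(tags)
    return request


def upload(request):
    """Send the request to S3. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    return boto3.client("s3").put_object(**request)


# Example values below are placeholders, not real resources.
example = build_put_request(
    bucket="example-data-lake",
    key="raw/enrollment/2020-01-15.csv",
    body=b"student_id,term\n",
    metadata={"source-system": "sis"},
    tags={"cost-center": "analytics"},
)
```

Calling `upload(example)` would perform the actual write, provided the caller's credentials allow `s3:PutObject` on that bucket. Keeping the request builder separate from the network call is just one possible design; it makes the parameters easy to inspect before anything is sent.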

The cost of storing data in S3 depends on how much data you have and how frequently you need to access it. S3 offers several storage classes (six at the time of writing) that trade off retrieval performance against price. In general, the more data you store and the more frequently you access it, the more it will cost. For a small monitoring and automation fee, you can let S3 Intelligent-Tiering automatically move your objects to the optimal access tier based on your access patterns.
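As a rough illustration of matching a storage class to an access pattern, the sketch below maps an expected read frequency to one of the `StorageClass` values that S3 accepts. The function name `choose_storage_class` and its thresholds are hypothetical and chosen only for illustration; they are not AWS guidance or HelioCampus policy, and real decisions should also weigh retrieval fees and minimum storage durations.

```python
def choose_storage_class(reads_per_month, predictable=True):
    """Pick an S3 StorageClass value from an expected access pattern.

    The thresholds are hypothetical and exist only to illustrate the idea.
    """
    if not predictable:
        # Unknown or changing access patterns: let S3 pick the tier.
        return "INTELLIGENT_TIERING"
    if reads_per_month >= 1:
        # Read at least monthly: keep in the frequent-access class.
        return "STANDARD"
    if reads_per_month > 0:
        # Read occasionally: the infrequent-access class.
        return "STANDARD_IA"
    # Never expected to be read in normal operation: archive it.
    return "GLACIER"


# A frozen snapshot that is kept only for reference.
archive_class = choose_storage_class(reads_per_month=0)
```

The returned string can be passed as the `StorageClass` parameter when an object is uploaded, for example with boto3's `put_object`.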

This is just one of the AWS tools that HelioCampus uses to support data analytics. Check our blog periodically, as we will be sharing posts on additional tools in the future.


