Learn about Astra, the new DataStax DBaaS offering for Cassandra.
All Other Authors: Radu Urziceanu, Sam Jacob, Jesus Moreno, and Expero Staff
Like many DataStax users, we were very excited to learn about Astra, their new DBaaS offering for Cassandra. At Expero, we’ve built several visually complex applications on top of DataStax Enterprise and we were eager to see how we could leverage Astra for our projects.
Our customers are in the business of building applications or services that solve a business need - server cluster installation and configuration is simply a stepping stone along the way. If we can crank out the resulting application without having to do all the server administration ourselves, we can focus our attention where it needs to be: on our client’s application.
Astra promises to minimize all that effort, and we needed to take it out for a spin.
We decided to port one of our supply chain demonstrator applications, which lets supply chain planners inspect portions of their supply chain and identify potential bottlenecks. We chose this app because it required us to exercise a couple of different capabilities of the database: its ability to support various summarization scenarios and its ability to manage time series data. Cassandra handles time series data well natively - even allowing an explicit time-windowed compaction strategy - and supports plenty of slicing and dicing if you model the data cleverly.
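To make that concrete, here is a minimal sketch of the kind of time series table we mean - the schema is illustrative rather than our actual data model, and note that a managed offering like Astra may restrict which table options you can override:

CREATE TABLE demand_history (
    sku text,
    day date,
    ts timestamp,
    units_shipped int,
    PRIMARY KEY ((sku, day), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
AND compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
};

Bucketing each partition by (sku, day) keeps partitions bounded, and the clustering column gives you time-range slices within a partition.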
A little bit about the application we opted to port: our supply chain demo is a React web application, typically served from a protected S3 bucket on an Expero-specific domain. The JavaScript that executes in the browser performs a mixture of GraphQL and direct client calls to a DataStax Enterprise server running inside our VPC.
Upon logging in, you have the option to create a new database. If you are at all familiar with Cassandra features and cloud paradigms, many of the settings should be straightforward.
You have the option to use either AWS or GCP as your hosting provider and select from a list of geographical regions depending on your provider. Furthermore, you have a free introductory service level which allows you to store up to 10GB - perfect for our purposes, so we opted for the free tier.
If you’ve opted for the free tier, you start with 1 Capacity Unit (CU). Afterwards, you can add more CUs via expansion if your application requires it. They’ve included a helpful cost estimator - something I appreciate, having learned the hard way after accidentally leaving a few m4.4xlarges running for several days.
Next you name your database and keyspace and provide access credentials.
Fire it up and in just a few minutes, you’ve now provisioned a new instance and you’re ready to start modeling and building your database.
After the instance has started, we see a list similar to the image below. The actions available behind the ellipses are self-explanatory. For starters, we reviewed the Connection Details and downloaded the secure connect bundle - it turns out we’ll need it later to load the data and connect the API.
Now that the database is up and running and we’ve downloaded the secure connect bundle, we can fire up DataStax Developer Studio. This tool is bundled with DataStax Enterprise (DSE) and should already be familiar to DataStax customers. It follows the notebook paradigm and comes with a Getting Started guide that’s well worth walking through.
So we created a new notebook called SupplyChainDemo to contain all our DDL scripts.
In the first notebook cell, we added all the table creation commands. I added a second cell to drop all the tables. This let me run and rerun the entire database creation and teardown cycle as I modeled the database.
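A sketch of that two-cell pattern, abridged to just one of our tables (the column list mirrors what our API code below reads; the full schema is omitted here):

-- Cell 1: (re)create the tables
CREATE TABLE IF NOT EXISTS finished_good (
    id text PRIMARY KEY,
    name text,
    retail_price double,
    currency text
);
-- ...one CREATE TABLE statement per table...

-- Cell 2: tear everything down
DROP TABLE IF EXISTS finished_good;
-- ...one DROP TABLE statement per table...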
After the database was created, it was time to load all the data from our prior demo site. Astra leverages the existing database loader, dsbulk, which supports loading delimited text files.
dsbulk is a mature and well-documented tool, so I won’t go through it feature by feature; instead I’ll call out a few items that were important or material. Interestingly, you need to pass both the username/password pair and the secure connect bundle on the command line - unlike, say, AWS SSH keys, where the key alone suffices for secure remote access.
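An invocation along these lines is what we mean - the file, keyspace and table names are illustrative, and you should check the dsbulk documentation for the exact flags your version supports:

dsbulk load \
  -url finished_good.csv \
  -k {my_keyspace} \
  -t finished_good \
  -b /path/to/secure-connect-bundle.zip \
  -u {database-user} \
  -p {database-password} \
  -header true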
The loader gives you terrific feedback on whether the file loaded or not - and each run of dsbulk creates its own time-stamped directory with informative files on what did and did not go well.
Helpfully, when something does go wrong, the loader creates files to help you troubleshoot.
In my case, some of my floating point columns needed to be recreated using the C* data type double rather than float.
After confirming the data was there in DSE Studio, it was time to hook up our web application.
We were particularly interested in trying out the new Astra facility to query the database directly with REST calls. To the extent this feature satisfied our needs, we would not need to build out a middleware tier, and at first blush it was going to be our preference.
Much in the same way that dsbulk relied on the secure connect bundle, you first need to request an authorization token to use in subsequent requests. After swapping in our specific hosting details for database ID and region, we got back a token for subsequent use:
curl --request POST \
  --url https://{databaseid}-{region}.apps.astra.datastax.com/api/rest/v1/auth \
  --header 'accept: */*' \
  --header 'content-type: application/json' \
  --header 'x-cassandra-request-id: {unique-UUID}' \
  --data '{"username":"{database-user}","password":"{database-password}"}'
The above request returns an auth token (a UUID) that gets plugged into the next REST call. Note that details of your data model, such as keyspace and table names, go directly into the URL:
curl --request GET \
  --url https://{databaseid}-{region}.apps.astra.datastax.com/api/rest/v1/keyspaces/{my_keyspace}/tables/{table_name}/rows/{primaryKey} \
  --header 'accept: application/json' \
  --header 'x-cassandra-request-id: {unique-UUID}' \
  --header 'x-cassandra-token: {auth-token}'
This REST call returns the JSON payload for the given primary key. Unfortunately, our application needed to query groups of products, which we normally do with queries along partition keys. That is not yet supported in the REST interface, so we needed some way of supporting broader, non-primary-key queries… so it looks like we will need a middle tier for the time being.
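For illustration, the kind of query we needed looks like this (the table and key names are hypothetical) - a partition-key slice that returns a whole group of rows rather than a single primary-key row:

-- fetch every product in one partition (group) in a single query
SELECT * FROM finished_good_by_group WHERE product_group = 'fasteners';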
Astra provides several options for connecting to your database via API: C++, C#, Java, Node.js and Python. We chose Java for expediency, as we were already coding a bunch of Spring Boot endpoints and we are familiar with it. This is a very mature part of the Astra offering - I’d expect the other drivers to work as seamlessly as the Java one. For reference, we used Spring Boot framework 2.1.13 and the DataStax Java driver from here.
Again, the Java driver needs the secure connect bundle zip as part of the database connection. For those of you who have used DSE or C* before, you’ll see that the actual querying code is no different - only establishing the connection is slightly different (and simpler):
import java.nio.file.Paths;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import com.datastax.oss.driver.api.core.CqlSession;

@Configuration
public class AstraCDBConnection {

    @Value("${astra.cassandra.user}")
    private String user;

    @Value("${astra.cassandra.password}")
    private String password;

    @Value("${astra.cassandra.keyspace}")
    private String keyspace;

    @Value("${astra.cassandra.securepath}")
    private String securepath;

    @Bean
    public CqlSession getDbSession() {
        // On Windows, pass the secure connect bundle path as a system property;
        // on Mac or Linux the default is e.g. /etc/astra/secure-connect-test.zip,
        // or use the system property to override it.
        if (System.getProperty("filepath") != null) {
            securepath = System.getProperty("filepath");
        }
        return CqlSession.builder()
                .withCloudSecureConnectBundle(Paths.get(securepath))
                .withAuthCredentials(user, password)
                .withKeyspace(keyspace)
                .build();
    }
}
There were no real surprises here - normal API programming, converting REST calls into driver calls: a controller class parses the URL for parameters and sends the query off to the database.
@RequestMapping("/getFinishedGoodsAll")
@ResponseBody
public String getFinishedGoodsAll() {
    CqlSession cqls = dbConnection.getDbSession();
    ResultSet rs = cqls.execute("select * from finished_good");
    List<FinishedGoods> fgList = new ArrayList<>();
    Iterator<Row> it = rs.iterator();
    try {
        while (it.hasNext()) {
            Row row = it.next();
            FinishedGoods fg = new FinishedGoods();
            fg.setId(row.getString("id"));
            fg.setName(row.getString("name"));
            fg.setUnitPrice(row.getDouble("retail_price"));
            fg.setCurrency(row.getString("currency"));
            fg.set__typename("FinishedGood");
            fgList.add(fg);
        }
        return objectMapper.writeValueAsString(fgList);
    } catch (Exception ex) {
        System.out.println(ex.getMessage());
    }
    return "No Results Returned";
}
One item to note: to prevent CORS errors, we enabled the Spring Boot setting for cross-origin support in the main Spring Boot controller:
@CrossOrigin(origins = "*", maxAge = 3600)
@Controller
public class AstraController {

    @Autowired
    private AstraCDBConnection dbConnection;

    private static final ObjectMapper objectMapper = new ObjectMapper();
Now we’ve got the REST endpoints swapped out for our new Spring Boot ones and our database modeled and loaded in Astra - and voila! It works like a charm. You can see the Studio assets, Spring Boot code and data loading scripts in the GitHub repo here.
Final Thoughts
Our porting exercise gave us a terrific perspective on the Astra product as it stands in beta today, and the confidence to use it when it moves to GA.
So give it a whirl! The free tier is there to let you try it out. Let us know what you think of this blog. Thanks!