360Science for Spark - Sample Applications

The sample applications demonstrate deduplication of CSV data held in a JavaRDD and of database tables loaded via JDBC into Datasets.

Each sample application folder contains:

src - Folder containing the sample source code.
build.sh - Script to build the application using Maven and the pom.xml file.
<app>-jar-with-dependencies.jar - Pre-built executable jar.
pom.xml - Maven build configuration.
readme - Text file with an overview of the application.
run.sh - Example script to run the application.
sampleconfig.xml - Example configuration file.

Additionally, the DedupeTextFile folder contains an example1.txt input file.

You don’t need to build the sample apps, because pre-built binaries are included. The build scripts are provided in case you want to modify the source to tailor the applications.
