Building Large-Scale End-to-End Recommendation Systems with BigDL Friesian

Overview

BigDL Friesian is an application framework for building large-scale recommender solutions optimized for Intel Xeon. This workflow demonstrates how to use Friesian to build an end-to-end Wide & Deep Learning recommender system on a large real-world dataset provided by Twitter.
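For readers new to Wide & Deep Learning, the following is a minimal pure-Python sketch of how such a model combines its two parts at prediction time. It does not use the Friesian API. The feature names (`language`, `tweet_type`), the bucket sizes, and all weights are hypothetical values chosen for illustration, and a trained model would learn them from data.

```python
import math
import zlib


def hash_bucket(value: str, num_buckets: int) -> int:
    """Map a string to a bucket id; crc32 is stable across runs and machines."""
    return zlib.crc32(value.encode("utf-8")) % num_buckets


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def wide_and_deep_score(features, wide_weights, embeddings, hidden_weights, output_weights):
    # Wide part: a linear weight looked up for a hashed cross of two
    # categorical features (memorizes specific feature combinations).
    cross = f"{features['language']}_x_{features['tweet_type']}"
    wide_logit = wide_weights[hash_bucket(cross, len(wide_weights))]

    # Deep part: look up one embedding per categorical feature and concatenate.
    deep_input = []
    for name, table in embeddings.items():
        deep_input.extend(table[hash_bucket(features[name], len(table))])

    # One hidden layer with ReLU, then a linear output (generalizes across features).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, deep_input))) for row in hidden_weights]
    deep_logit = sum(w * h for w, h in zip(output_weights, hidden))

    # Both logits are summed before the sigmoid, so the parts are trained jointly.
    return sigmoid(wide_logit + deep_logit)


# Hypothetical example input and parameters.
features = {"language": "en", "tweet_type": "Retweet"}
wide_weights = [0.1 * i for i in range(8)]
embeddings = {
    "language": [[0.1 * i, -0.1 * i] for i in range(4)],
    "tweet_type": [[0.2 * i, 0.05 * i] for i in range(4)],
}
hidden_weights = [[0.5, -0.5, 0.25, 0.25], [0.1, 0.1, 0.1, 0.1], [-0.3, 0.3, 0.0, 0.2]]
output_weights = [1.0, -1.0, 0.5]

score = wide_and_deep_score(features, wide_weights, embeddings, hidden_weights, output_weights)
print(f"predicted engagement probability: {score:.3f}")
```

The wide part memorizes specific feature combinations, while the deep part generalizes through embeddings. Summing the two logits lets a single output draw on both.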

How it Works

Friesian provides various built-in distributed feature engineering operations and the distributed training of popular recommendation algorithms based on BigDL Orca and Spark.
Friesian provides a complete, highly available and scalable pipeline for online serving (including recall and ranking) as well as nearline updates based on gRPC services.
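The two serving stages mentioned above, recall and ranking, can be sketched in plain Python as follows. This is only an illustration of the general two-stage pattern under stated assumptions, not Friesian's implementation: there are no gRPC services here, and the function names, embeddings, and ranking scores are all hypothetical.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def recall(user_embedding, item_embeddings, k):
    """Recall stage: cheaply narrow the catalog to the k most similar items."""
    return heapq.nlargest(
        k,
        item_embeddings,  # iterating a dict yields the item ids
        key=lambda item_id: dot(user_embedding, item_embeddings[item_id]),
    )


def rank(user_id, candidates, score_fn):
    """Ranking stage: score only the recalled candidates with a costlier model."""
    return sorted(candidates, key=lambda item_id: score_fn(user_id, item_id), reverse=True)


def recommend(user_id, user_embeddings, item_embeddings, score_fn, recall_k=3, top_n=2):
    candidates = recall(user_embeddings[user_id], item_embeddings, recall_k)
    return rank(user_id, candidates, score_fn)[:top_n]


# Hypothetical embeddings and ranking scores.
user_embeddings = {"u1": [1.0, 0.0]}
item_embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.5, 0.5]}
ranking_scores = {"a": 0.2, "b": 0.7, "c": 0.4, "d": 0.9}


def score_fn(user_id, item_id):
    # Stand-in for a trained ranking model such as Wide & Deep.
    return ranking_scores[item_id]


print(recommend("u1", user_embeddings, item_embeddings, score_fn))  # ['d', 'b']
```

Splitting the work this way keeps the expensive ranking model off the full catalog: recall trims four items to three candidates here, and ranking reorders only those.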

The overall architecture of Friesian is shown in the following diagram:

Get Started

Dataset Preparation

You can download the Twitter RecSys Challenge 2021 dataset from here. Alternatively, run the script generate_dummy_data.py to generate a dummy dataset.

To run on a Kubernetes cluster, you may need to place the downloaded data on a shared volume. Please refer to here for more details.

Docker

Please refer to here for the docker image for BigDL on K8s.
Please refer to here to create a client container for the Kubernetes cluster.

Environment Preparation

Please follow the steps here to prepare the Python environment on the client container.

How to run

Please refer to here to run the distributed feature engineering and training workload on a Kubernetes cluster. The scripts are here.
Please refer to here to run the online serving workload.

Recommended Hardware

The hardware below is recommended for use with this reference implementation.

4th Gen Intel® Xeon® Scalable processors

Learn More

Please check the notebooks here for more detailed descriptions of distributed feature engineering and training.
Please check here for more reference use cases.
Please check here for more detailed API documentation.

Known Issues

NA

Troubleshooting

NA

Support Forum

Please submit issues here and we will track and respond to them daily.
