Introducing .NET for Apache Spark: Distributed Processing for Massive Datasets, 1st ed.
Author: Ed Elliott
- Install and configure Spark .NET on Windows, Linux, and macOS
- Write Apache Spark programs in C# and F# using the .NET bindings
- Access and invoke the Apache Spark APIs from .NET with the same high performance as Python, Scala, and R
- Encapsulate functionality in user-defined functions
- Transform and aggregate large datasets
- Execute SQL queries against files through Apache Hive
- Distribute processing of large datasets across multiple servers
- Create your own batch, streaming, and machine learning programs
Part I. Getting Started.- 1. Understanding Apache Spark.- 2. Setting up Spark.- 3. Programming with .NET for Apache Spark.
Part II. The APIs.- 4. User-Defined Functions.- 5. The DataFrame API.- 6. Spark SQL and Hive Tables.- 7. Spark Machine Learning API.
Part III. Examples.- 8. Batch Mode Processing.- 9. Structured Streaming.- 10. Troubleshooting.- 11. Delta Lake.
Part IV. Appendices.- Appendix A. Running in the Cloud.- Appendix B. Implementing .NET for Apache Spark Code.
Publication date: 04-2021
262 pages
17.8 x 25.4 cm