Microsoft brings .NET dev to Apache Spark
Microsoft and the .NET Foundation have released version 1.0 of .NET for Apache Spark, an open source package that brings .NET development to the Spark analytics engine for large-scale data processing.
Announced October 27, .NET for Apache Spark 1.0 supports .NET applications targeting .NET Standard 2.0 or later. Users can access Spark DataFrame APIs, write Spark SQL, and create user-defined functions (UDFs).
The .NET for Apache Spark framework is available on the .NET Foundation's GitHub page or from NuGet. Other capabilities of .NET for Apache Spark 1.0 include:
An API extension framework to add support for additional Spark libraries, including Linux Foundation Delta Lake, Microsoft OSS Hyperspace, ML.NET, and Apache Spark MLlib functionality.
.NET for Apache Spark programs that are not UDFs show the same speed as Scala and PySpark-based non-UDF applications. If applications include UDFs, .NET for Apache Spark programs are at least as fast as PySpark programs and might be faster.
.NET for Apache Spark is built into Azure Synapse and Azure HDInsight. It can also be used in other Apache Spark cloud offerings, including Azure Databricks.
The first public version of the project was announced in April 2019. Driving the development of .NET for Apache Spark was increased demand for an easier way to build big data applications without having to learn Scala or Python. The project is operated under the .NET Foundation and has been filed as a Spark Project Improvement Proposal to be considered for inclusion in the Apache Spark project directly.
Looking forward, Microsoft is addressing obstacles including setting up prerequisites and dependencies and finding quality documentation, with examples such as community-contributed "ready-to-run" Docker images and updates to .NET for Apache Spark documentation. Another priority is supporting deployment options, including integration with CI/CD devops pipelines and publishing jobs directly from Visual Studio.