如何运行含spark的python脚本

2024-12-04 09:27:23
推荐回答(1个)
回答1:

~spark$ bin/spark-submit first.py
-----------first.py-------------------------------
from pyspark import SparkConf, SparkContext
conf = SparkConf().setMaster("local").setAppName("My App")
sc = SparkContext(conf = conf)
lines = sc.textFile("first.py")
pythonLines = lines.filter(lambda line: "Python" in line)
print "hello python"
print pythonLines.first()
print pythonLines.first()
print "hello spark!"
---------------------------------------------------
hello python
pythonLines = lines.filter(lambda line: "Python" in line)
pythonLines = lines.filter(lambda line: "Python" in line)
hello spark!

到spark的安装目录下/bin 下面 spark-submit ***.py 即可