﻿<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Big Data Application Technology &#8211; Academic Innovation Center</title>
	<atom:link href="https://www.leexinghai.com/aic/category/gcc/2stls/bdat/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.leexinghai.com/aic</link>
	<description>Academic Innovation Center</description>
	<lastBuildDate>Sat, 02 Aug 2025 11:06:32 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://www.leexinghai.com/aic/wp-content/uploads/2025/08/cropped-徽标名称-32x32.jpg</url>
	<title>Big Data Application Technology &#8211; Academic Innovation Center</title>
	<link>https://www.leexinghai.com/aic</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>H30507-Lab 8-Spark Configuration Tutorial and Programming Practice</title>
		<link>https://www.leexinghai.com/aic/sparkinst8/</link>
		
		<dc:creator><![CDATA[李星海]]></dc:creator>
		<pubDate>Sun, 07 May 2023 10:25:30 +0000</pubDate>
				<category><![CDATA[Big Data Application Technology]]></category>
		<guid isPermaLink="false">https://aic.leexinghai.com/?p=2302</guid>

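					<description><![CDATA[This article has two parts. Part 1 is a Spark configuration tutorial (the setup in the XMU tutorial is too old, so I switched versions and briefly recorded the steps here, and also [&#8230;]]]></description>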
										<content:encoded><![CDATA[
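<p>This article has two parts. Part 1 is a Spark configuration tutorial: the setup in the XMU tutorial is too old, so I switched versions and briefly recorded the steps here, and it also resolves the startup warnings that the instructor raised in class. Part 2 is a reference walkthrough for the Lab 8 report of the Big Data Application Technology course at Guangzhou College of Commerce.</p>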



<h2 class="wp-block-heading">Part 1: Spark Configuration Tutorial</h2>



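<p>Software versions: Hadoop 3.1.3 (the same version as in the virtual machine I built) | Spark 3.0.0 (<mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color"><strong>do not use the 2.4.0 release provided by XMU</strong></mark>) | Linux: Ubuntu 23.04 (the original Ubuntu 16 image should also work)</p>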



<p>Download link: <a href="https://archive.apache.org/dist/spark/spark-3.0.0/">Index of /dist/spark/spark-3.0.0 (apache.org)</a></p>



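<p>1. From the link above, download spark-3.0.0-bin-without-hadoop.tgz to your home directory and extract it into /usr/local with the tar command. Rename the extracted folder to spark and give the hadoop user ownership of it. Then rename the configuration template by removing the .template suffix so that it becomes the active configuration file, and open it for editing. The commands are shown in Figure 1.</p>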


<div class="wp-block-image">
<figure class="aligncenter size-full"><img fetchpriority="high" decoding="async" width="865" height="217" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-13.png" alt="" class="wp-image-2303" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-13.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-13-300x75.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-13-768x193.png 768w" sizes="(max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 1</figcaption></figure>
</div>
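

<p>For readers who cannot see the screenshot, the step can be sketched as the commands below. This is a reconstruction rather than a copy of the screenshot. It assumes the archive sits in the home directory, that the user and group are both named hadoop, and that the template in question is spark-env.sh.template (the file referred to again in step 4).</p>



<pre class="wp-block-code"><code># Extract the archive from the home directory into /usr/local
sudo tar -zxf ~/spark-3.0.0-bin-without-hadoop.tgz -C /usr/local/

# Rename the extracted folder to spark
cd /usr/local
sudo mv ./spark-3.0.0-bin-without-hadoop ./spark

# Give the hadoop user ownership (assumes user and group are both "hadoop")
sudo chown -R hadoop:hadoop ./spark

# Drop the .template suffix so the file becomes the active configuration
cd ./spark/conf
mv spark-env.sh.template spark-env.sh
</code></pre>



<p>After this, spark-env.sh can be opened in any editor, for example with vim.</p>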


<p>2. Edit the configuration file: add the content shown in Figure 2 on an empty line.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" width="865" height="187" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-14.png" alt="" class="wp-image-2304" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-14.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-14-300x65.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-14-768x166.png 768w" sizes="(max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 2</figcaption></figure>
</div>
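

<p>The exact line lives in the screenshot. For a "without-hadoop" Spark build, the usual setting is the one below, which tells Spark where to find the Hadoop jars. Treat it as an assumption about what the screenshot contains; it also assumes Hadoop is installed in /usr/local/hadoop.</p>



<pre class="wp-block-code"><code># spark-env.sh: let the Hadoop-free Spark build pick up the Hadoop classpath
# (assumes Hadoop is installed in /usr/local/hadoop)
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
</code></pre>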


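<p>3. Spark can now be started directly with bin/spark-shell from the spark directory. Two warnings appear: (1) the hostname resolves to the loopback address 127.0.1.1, and Spark suggests using the virtual machine's static IP address instead; (2) the native Hadoop library cannot be loaded, so built-in Java classes are used where applicable. The warnings are shown in Figure 3.</p>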


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" width="865" height="276" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-15.png" alt="" class="wp-image-2305" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-15.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-15-300x96.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-15-768x245.png 768w" sizes="(max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 3</figcaption></figure>
</div>


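<p>4. To fix warning 1, add the line shown in Figure 4 to spark-env.sh (the file opened by the command highlighted in purple in Figure 1), setting SPARK_LOCAL_IP to your virtual machine's static IP address.</p>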


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="149" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-16.png" alt="" class="wp-image-2306" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-16.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-16-300x52.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-16-768x132.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 4</figcaption></figure>
</div>
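

<p>As a sketch, the added line looks like the following. The address 192.168.245.5 is the one used by the author's virtual machine elsewhere in this article; replace it with the static IP address of your own machine.</p>



<pre class="wp-block-code"><code># spark-env.sh: bind Spark to the static IP of the VM instead of 127.0.1.1
# (192.168.245.5 is the example address; use your own)
export SPARK_LOCAL_IP=192.168.245.5
</code></pre>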


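<p>5. To fix warning 2, open the file with <code>vim /etc/profile</code> (root privileges are needed to save it) and add the lines shown in Figure 5, which point to the Hadoop native library directory.</p>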


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="289" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-17.png" alt="" class="wp-image-2307" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-17.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-17-300x100.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-17-768x257.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 5</figcaption></figure>
</div>
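

<p>The screenshot holds the exact lines. A common way to expose the native Hadoop library is sketched below; the variable names and the /usr/local/hadoop location are assumptions, so compare them with the screenshot and with your own installation before using them.</p>



<pre class="wp-block-code"><code># /etc/profile: make the Hadoop native libraries visible to the JVM
# (assumes Hadoop is installed in /usr/local/hadoop)
export HADOOP_HOME=/usr/local/hadoop
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH
</code></pre>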


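<p>6. After step 5, run <code>source /etc/profile</code> to reload the file so that the new settings take effect. Then run <code>spark/bin/spark-shell</code> again to restart the Spark shell; the warnings are gone, as shown in Figure 6.</p>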


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img loading="lazy" decoding="async" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-18.png" alt="" class="wp-image-2308" style="width:651px;height:282px" width="651" height="282" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-18.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-18-300x130.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-18-768x334.png 768w" sizes="auto, (max-width: 651px) 100vw, 651px" /><figcaption class="wp-element-caption">Figure 6</figcaption></figure>
</div>


<p>7. Install sbt. First install curl with <code>sudo apt install curl</code>, then run the following commands:</p>



<pre class="wp-block-code"><code>echo "deb https://repo.scala-sbt.org/scalasbt/debian all main" | sudo tee /etc/apt/sources.list.d/sbt.list

echo "deb https://repo.scala-sbt.org/scalasbt/debian /" | sudo tee /etc/apt/sources.list.d/sbt_old.list

curl -sL "https://keyserver.ubuntu.com/pks/lookup?op=get&amp;search=0x2EE0EA64E40A89B84B2DF73499E82A75642AC823" | sudo apt-key add

sudo apt-get update

sudo apt-get install sbt
</code></pre>



<p>The result is shown in Figure 7.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="194" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-19.png" alt="" class="wp-image-2309" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-19.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-19-300x67.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-19-768x172.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 7</figcaption></figure>
</div>


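<p>8. After installation, run sbt. On first launch it sets itself up, for example by downloading the sbt launcher, as shown in Figure 8; the result once the download completes is shown in Figure 9.</p>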


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img loading="lazy" decoding="async" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-20.png" alt="" class="wp-image-2310" style="width:840px;height:94px" width="840" height="94" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-20.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-20-300x34.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-20-768x86.png 768w" sizes="auto, (max-width: 840px) 100vw, 840px" /><figcaption class="wp-element-caption">Figure 8</figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="581" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-21.png" alt="" class="wp-image-2311" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-21.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-21-300x202.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-21-768x516.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 9</figcaption></figure>
</div>


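<p>9. Create a few simple folders under the spark directory for later use. The commands and the resulting folder structure are shown in Figure 10.</p>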


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="191" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-22.png" alt="" class="wp-image-2312" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-22.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-22-300x66.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-22-768x170.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 10</figcaption></figure>
</div>


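<p>10. In the ch10_shell folder you just created, create a file named hw1.txt and type in any content; it will be used for counting lines. See Figures 11 and 12 for reference.</p>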


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="467" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-23.png" alt="" class="wp-image-2313" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-23.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-23-300x162.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-23-768x415.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 11</figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="405" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-24.png" alt="" class="wp-image-2314" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-24.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-24-300x140.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-24-768x360.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 12</figcaption></figure>
</div>
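

<p>If you prefer not to type the file by hand, a minimal sketch like the one below creates it from the terminal. The three sample lines and the WORK_DIR variable (with its default location) are placeholders introduced here for illustration; point WORK_DIR at your own ch10_shell folder.</p>



<pre class="wp-block-code"><code># WORK_DIR is a placeholder: point it at the ch10_shell folder from step 9
WORK_DIR="${WORK_DIR:-$HOME/ch10_shell}"
mkdir -p "$WORK_DIR"

# Write three sample lines (any content works) and echo them to the terminal
printf '%s\n' 'Hello Spark' 'Hello Hadoop' 'Hello HDFS' | tee "$WORK_DIR/hw1.txt"

# Cross-check: print the number of lines in the file
grep -c '' "$WORK_DIR/hw1.txt"
</code></pre>



<p>The number printed by the last command is the value the line count in Spark should report for the same file.</p>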


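<p>11. Work with the local file in the Spark shell. The command <code>val textFile = sc.textFile("file://local file path on the VM")</code> creates a variable named textFile (for an absolute path the URI has three slashes, for example file:///home/hadoop/hw1.txt), and <code>textFile.count()</code> then counts the lines of the file. The result is shown in Figure 13.</p>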


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="144" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-25.png" alt="" class="wp-image-2315" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-25.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-25-300x50.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-25-768x128.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 13</figcaption></figure>
</div>


<p>12. Start Hadoop with <code>/usr/local/hadoop/sbin/start-all.sh</code>. Then use the mkdir and put commands to create some folders on HDFS and upload the hw1.txt file created earlier. The commands are shown in Figure 14.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="211" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-26.png" alt="" class="wp-image-2316" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-26.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-26-300x73.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-26-768x187.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 14</figcaption></figure>
</div>
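

<p>A sketch of this step is given below. The folder names ch10 and input_shell come from Part 2 of this article, but placing ch10 directly under the HDFS root is an assumption, as is having the hdfs command on your PATH (otherwise call /usr/local/hadoop/bin/hdfs).</p>



<pre class="wp-block-code"><code># Start the Hadoop daemons
/usr/local/hadoop/sbin/start-all.sh

# Create the target folders on HDFS (the /ch10 location is an assumption)
hdfs dfs -mkdir -p /ch10/input_shell

# Upload hw1.txt from the current local folder, then list the target folder
hdfs dfs -put ./hw1.txt /ch10/input_shell/
hdfs dfs -ls /ch10/input_shell
</code></pre>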


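<p>13. Work with HDFS in the Spark shell. Running <code>val yourVariableName = sc.textFile("HDFS path")</code> followed by <code>yourVariableName.count()</code> (the variable name is up to you) reads the file on HDFS and operates on it. The commands are shown in Figures 15 and 16.</p>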


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="161" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-27.png" alt="" class="wp-image-2317" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-27.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-27-300x56.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-27-768x143.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 15</figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="223" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-28.png" alt="" class="wp-image-2318" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-28.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-28-300x77.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-28-768x198.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 16</figcaption></figure>
</div>


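<p>[Updated 2023-10-03] 14. If reading from HDFS fails, specify the HDFS address explicitly, as shown in Figure 16-1. If the HDFS address has been changed from its default, prefix the HDFS root path with <strong>hdfs://IP address:port</strong> (see the core-site.xml file of your Hadoop installation for the exact value).</p>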


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="386" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/10/FAD813EC6414242E755A0060074D32B9-1024x386.png" alt="" class="wp-image-2903" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/10/FAD813EC6414242E755A0060074D32B9-1024x386.png 1024w, https://www.leexinghai.com/aic/wp-content/uploads/2023/10/FAD813EC6414242E755A0060074D32B9-300x113.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/10/FAD813EC6414242E755A0060074D32B9-768x289.png 768w, https://www.leexinghai.com/aic/wp-content/uploads/2023/10/FAD813EC6414242E755A0060074D32B9.png 1489w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Figure 16-1</figcaption></figure>
</div>
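

<p>Instead of opening core-site.xml by hand, the configured address can also be printed from the terminal. The sketch below assumes the hdfs command is on your PATH and that the address is stored under the standard fs.defaultFS key.</p>



<pre class="wp-block-code"><code># Print the HDFS address that core-site.xml defines (key: fs.defaultFS)
hdfs getconf -confKey fs.defaultFS
</code></pre>



<p>The printed value, which has the form hdfs://IP address:port, is the prefix to place in front of the path passed to <code>sc.textFile</code>.</p>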


<h2 class="wp-block-heading">Part 2: Reference Walkthrough for the Lab 8 Report, Big Data Application Technology Course, Guangzhou College of Commerce</h2>



<p><strong>I. Objectives</strong></p>



<ol class="wp-block-list" type="a">
<li>Learn how to access local files and HDFS files with Spark;</li>



<li>Learn how to write, compile, and run Spark applications.</li>
</ol>



<p><strong>II. Equipment and Materials</strong></p>



<ul class="wp-block-list" type="a">
<li>JDK 1.8</li>



<li>Eclipse 2019-12(R)</li>



<li>Hadoop 3.3.1 (this example uses 3.1.3)</li>



<li>Spark 3.2.0 (this example uses 3.0.0)</li>



<li>sbt 1.5.5 (this example uses 1.8.2)</li>



<li>Maven 3.8.3 (not used in this example)</li>
</ul>



<p><strong>III. Principles</strong></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="627" height="289" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-30.png" alt="" class="wp-image-2323" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-30.png 627w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-30-300x138.png 300w" sizes="auto, (max-width: 627px) 100vw, 627px" /></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="461" height="492" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-31.png" alt="" class="wp-image-2324" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-31.png 461w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-31-281x300.png 281w" sizes="auto, (max-width: 461px) 100vw, 461px" /></figure>



<p><strong>IV. Tasks and Steps</strong></p>



<ol class="wp-block-list" type="1">
<li><strong>Read data from file systems with Spark</strong></li>



<li>In spark-shell, read the local Linux file “/home/hadoop/test.txt” and count its lines;</li>



<li>In spark-shell, read the HDFS file “/user/hadoop/test.txt” (create it first if it does not exist) and count its lines;</li>



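<li><strong>Write a standalone application (Scala recommended) that reads the HDFS file “/user/hadoop/test.txt” (create it first if it does not exist) and counts its lines; use sbt to compile and package the application into a JAR, then submit the JAR to Spark with spark-submit.</strong></li>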



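<li><strong>Write a standalone application that removes duplicate data: given two input files A and B, write a standalone Spark application (Scala recommended) that merges the two files and removes duplicate content, producing a new file C.</strong></li>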



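<li><strong>Write a standalone application that computes averages: each input file holds the grades of a class for one subject, and each line has two fields, the student's name and the student's grade; write a standalone Spark application that computes every student's average grade and writes the result to a new file.</strong></li>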
</ol>



<p><strong>V. Results and Analysis</strong></p>



<ol class="wp-block-list" type="1">
<li>In spark-shell, read the local Linux file “/home/hadoop/test.txt” and count its lines;</li>
</ol>



<p>1.1 Create test.txt anywhere (it is named hw1.txt in this example) and enter any content. The commands and content are shown in Figure 1.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="405" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-32.png" alt="" class="wp-image-2326" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-32.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-32-300x140.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-32-768x360.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 1</figcaption></figure>
</div>


<p>1.2 Call the count() method on the local file to count its lines. The result is shown in Figure 2.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="144" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-33.png" alt="" class="wp-image-2327" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-33.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-33-300x50.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-33-768x128.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 2</figcaption></figure>
</div>


<p>2. In spark-shell, read the HDFS file “/user/hadoop/test.txt” (create it first if it does not exist) and count its lines;</p>



<p>2.1 Create a ch10 folder on HDFS, create an input_shell folder inside it, and upload the file from step 1 with the -put command. The commands are shown in Figure 3.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="894" height="218" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-34.png" alt="" class="wp-image-2328" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-34.png 894w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-34-300x73.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-34-768x187.png 768w" sizes="auto, (max-width: 894px) 100vw, 894px" /><figcaption class="wp-element-caption">Figure 3</figcaption></figure>
</div>


<p>2.2 Call the count() method on the HDFS file to count its lines. The result is shown in Figure 4.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="161" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-35.png" alt="" class="wp-image-2329" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-35.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-35-300x56.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-35-768x143.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 4</figcaption></figure>
</div>


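<h4 class="wp-block-heading">3. Write a standalone application (Scala recommended) that reads the HDFS file “/user/hadoop/test.txt” (create it first if it does not exist) and counts its lines; use sbt to compile and package the application into a JAR, then submit the JAR to Spark with spark-submit.</h4>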



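<p>3.1 Under the spark folder, create the nested folders sparkapps/ch10_scala/src/main/scala. In the scala folder, create the file HW10_1_CountLine.scala; in the ch10_scala folder, create the file HW10_1_CountLine.sbt. The commands are shown in Figure 5.</p>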


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="53" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-36.png" alt="" class="wp-image-2330" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-36.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-36-300x18.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-36-768x47.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 5</figcaption></figure>
</div>
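

<p>The folder layout described in this step can be created roughly as follows. This is a sketch, not a copy of the screenshot; it assumes Spark is installed in /usr/local/spark and that the current user can write there.</p>



<pre class="wp-block-code"><code># Create the sbt project layout under the spark folder
cd /usr/local/spark
mkdir -p sparkapps/ch10_scala/src/main/scala

# Create the (still empty) source file and build definition
touch sparkapps/ch10_scala/src/main/scala/HW10_1_CountLine.scala
touch sparkapps/ch10_scala/HW10_1_CountLine.sbt
</code></pre>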


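<p>3.2 The content of HW10_1_CountLine.scala is shown in Figure 6 and the content of HW10_1_CountLine.sbt in Figure 7. Note that the HDFS address is the one configured in hadoop/etc/hadoop/core-site.xml (192.168.245.5:9810 in this example and in the virtual machine I built). The Scala version and the spark-core version in the sbt file are 2.12.10 and 3.0.0; if you use a virtual machine you configured yourself, change them to match the versions on your machine. This note is not repeated below.</p>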


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="410" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-37.png" alt="" class="wp-image-2331" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-37.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-37-300x142.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-37-768x364.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 6</figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="194" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-38.png" alt="" class="wp-image-2332" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-38.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-38-300x67.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-38-768x172.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 7</figcaption></figure>
</div>
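

<p>The two files exist only as screenshots, so here is a minimal sketch of what they could contain, written from the terminal. It is not the author's exact code. The commands are meant to be run from inside the ch10_scala folder; the HDFS path /ch10/input_shell/hw1.txt is an assumption, while the address and version numbers are the ones quoted in this step.</p>



<pre class="wp-block-code"><code># Run from inside the ch10_scala folder
mkdir -p src/main/scala

# Scala source: count the lines of one HDFS file (path is an assumption)
printf '%s\n' \
  'import org.apache.spark.{SparkConf, SparkContext}' \
  '' \
  'object HW10_1_CountLine {' \
  '  def main(args: Array[String]): Unit = {' \
  '    // Adjust host, port and path to your own cluster' \
  '    val input = "hdfs://192.168.245.5:9810/ch10/input_shell/hw1.txt"' \
  '    val sc = new SparkContext(new SparkConf().setAppName("HW10_1_CountLine"))' \
  '    println("Lines: " + sc.textFile(input).count())' \
  '    sc.stop()' \
  '  }' \
  '}' \
  | tee src/main/scala/HW10_1_CountLine.scala

# sbt build definition: versions must match the installed Scala and Spark
printf '%s\n' \
  'name := "HW10_1_CountLine"' \
  'version := "1.0"' \
  'scalaVersion := "2.12.10"' \
  'libraryDependencies += "org.apache.spark" %% "spark-core" % "3.0.0"' \
  | tee HW10_1_CountLine.sbt
</code></pre>



<p>Using tee prints each file as it is written, which makes it easy to compare the result with the screenshots.</p>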


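<p>3.3 When you finish editing, package the application with <code>sbt package</code>. The first run may take a long time to download dependencies (it is much faster outside mainland China); in Figure 8 the download took 4118 seconds, that is, 1 hour 8 minutes 38 seconds. If you hit an AccessDenied error, set the permissions of the sparkapps folder to <strong>777</strong>.</p>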


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="520" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-39.png" alt="" class="wp-image-2333" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-39.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-39-300x180.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-39-768x462.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 8</figcaption></figure>
</div>


<p>3.4 After packaging, a jar file appears in the target/scala-2.12 folder, as shown in Figure 9.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="75" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-40.png" alt="" class="wp-image-2334" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-40.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-40-300x26.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-40-768x67.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 9</figcaption></figure>
</div>


<p>3.5 Run <code>spark-submit</code> (from Spark's bin directory if it is not on your PATH) to count the lines of the HDFS file and print the result. The command and its output are shown in Figure 10.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="59" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-41.png" alt="" class="wp-image-2335" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-41.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-41-300x20.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-41-768x52.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 10</figcaption></figure>
</div>
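

<p>Put together, the build and submit steps for the line-count application might look like the sketch below. The jar file name is inferred by analogy with the Merge jar name quoted later in this article, so check the actual name in target/scala-2.12 before running the last command.</p>



<pre class="wp-block-code"><code># Build the jar from the project folder
cd /usr/local/spark/sparkapps/ch10_scala
sbt package

# Submit it to Spark (jar name inferred; verify it in target/scala-2.12)
/usr/local/spark/bin/spark-submit --class "HW10_1_CountLine" ./target/scala-2.12/hw10_1_countline_2.12-1.0.jar
</code></pre>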


<h4 class="wp-block-heading">4. Write a standalone application that removes duplicate data</h4>



<h5 class="wp-block-heading">Given two input files A and B, write a standalone Spark application (Scala recommended) that merges the two files and removes duplicate content, producing a new file C.</h5>



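<p>4.1 In the scala folder, create HW10_2_Merge.scala; in the ch10_scala folder, create HW10_2_Merge.sbt (see Figure 5 for the file-creation commands). The content of HW10_2_Merge.scala is shown in Figure 11 and the content of HW10_2_Merge.sbt in Figure 12.</p>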


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="453" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-42.png" alt="" class="wp-image-2336" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-42.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-42-300x157.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-42-768x402.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 11</figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="807" height="201" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-43.png" alt="" class="wp-image-2337" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-43.png 807w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-43-300x75.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-43-768x191.png 768w" sizes="auto, (max-width: 807px) 100vw, 807px" /><figcaption class="wp-element-caption">Figure 12</figcaption></figure>
</div>
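

<p>As with the previous task, the code is only available as screenshots. One possible version is sketched below; it may differ from the author's. It relies on the RDD <code>distinct()</code> operation to drop repeated lines. The output folder name output_merge is hypothetical, and the input location follows the ch10/input_merge folder used in this task, assumed to sit under the HDFS root. Run the commands from inside the ch10_scala folder.</p>



<pre class="wp-block-code"><code># Run from inside the ch10_scala folder
mkdir -p src/main/scala

# Scala source: read every file in input_merge, drop blank and repeated lines
printf '%s\n' \
  'import org.apache.spark.{SparkConf, SparkContext}' \
  '' \
  'object HW10_2_Merge {' \
  '  def main(args: Array[String]): Unit = {' \
  '    // Adjust host, port and folders to your own cluster' \
  '    val base = "hdfs://192.168.245.5:9810/ch10"' \
  '    val sc = new SparkContext(new SparkConf().setAppName("HW10_2_Merge"))' \
  '    sc.textFile(base + "/input_merge")' \
  '      .filter(_.trim.nonEmpty)' \
  '      .distinct()' \
  '      .coalesce(1) // write a single output file' \
  '      .saveAsTextFile(base + "/output_merge") // hypothetical output folder' \
  '    sc.stop()' \
  '  }' \
  '}' \
  | tee src/main/scala/HW10_2_Merge.scala

# sbt build definition for this task
printf '%s\n' \
  'name := "HW10_2_Merge"' \
  'version := "1.0"' \
  'scalaVersion := "2.12.10"' \
  'libraryDependencies += "org.apache.spark" %% "spark-core" % "3.0.0"' \
  | tee HW10_2_Merge.sbt
</code></pre>



<p>Note that <code>saveAsTextFile</code> refuses to write into a folder that already exists, so remove the output folder on HDFS before running the job a second time.</p>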


<p>4.2 Delete the target and project folders under ch10_scala. The folder contents after deletion are shown in Figure 13.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="660" height="297" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-44.png" alt="" class="wp-image-2338" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-44.png 660w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-44-300x135.png 300w" sizes="auto, (max-width: 660px) 100vw, 660px" /><figcaption class="wp-element-caption">Figure 13</figcaption></figure>
</div>


<p>4.3 On HDFS, copy the two data sources from Lab 6 into the ch10/input_merge folder. The result is shown in Figure 14.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="170" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-45.png" alt="" class="wp-image-2339" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-45.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-45-300x59.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-45-768x151.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 14</figcaption></figure>
</div>


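<p>4.4 After packaging with <code>sbt package</code>, run <code>spark-submit --class "HW10_2_Merge" /usr/local/spark/sparkapps/ch10_scala/target/scala-2.12/hw10_2_merge_2.12-1.0.jar</code> to merge and deduplicate the data files on HDFS. When the job finishes, inspect the generated file with the cat command; the result is shown in Figure 15.</p>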


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="367" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-46.png" alt="" class="wp-image-2340" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-46.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-46-300x127.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-46-768x326.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 15</figcaption></figure>
</div>


<h4 class="wp-block-heading">5. Write a standalone application that computes averages</h4>



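<h5 class="wp-block-heading">Each input file holds the grades of a class for one subject. Each line has two fields: the student's name and the student's grade. Write a standalone Spark application that computes every student's average grade and writes the result to a new file.</h5>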



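<p>5.1 In the ch10_scala folder, create an input_avg folder and create three files in it (named Algorithm, Database, and Python in this example; see Figure 16 for sample content). Then upload the folder to the ch10 folder on HDFS; the commands are shown in Figure 17.</p>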


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="436" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-47.png" alt="" class="wp-image-2341" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-47.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-47-300x151.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-47-768x387.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 16</figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="325" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-48.png" alt="" class="wp-image-2342" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-48.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-48-300x113.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-48-768x289.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 17</figcaption></figure>
</div>
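

<p>To make the expected input format concrete, the sketch below creates the three files with made-up data. The student names and grades are invented for illustration and are not the values in the screenshot; each line holds a name and a grade separated by a space. Run the commands from inside the ch10_scala folder.</p>



<pre class="wp-block-code"><code># Run from inside the ch10_scala folder
mkdir -p input_avg

# One file per subject; each line is "name grade" (sample data is made up)
printf '%s\n' 'Alice 88' 'Bob 92' 'Carol 75' | tee input_avg/Algorithm
printf '%s\n' 'Alice 90' 'Bob 85' 'Carol 80' | tee input_avg/Database
printf '%s\n' 'Alice 95' 'Bob 78' 'Carol 82' | tee input_avg/Python
</code></pre>



<p>The folder can then be uploaded with a command such as <code>hdfs dfs -put input_avg /ch10/</code>, assuming the ch10 folder sits directly under the HDFS root.</p>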


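<p>5.2 In the scala folder, create HW10_3_AvgScore.scala; in the ch10_scala folder, create HW10_3_AvgScore.sbt (see Figure 5 for the file-creation commands). The content of HW10_3_AvgScore.scala is shown in Figure 18 and the content of HW10_3_AvgScore.sbt in Figure 19.</p>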


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="614" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-49.png" alt="" class="wp-image-2343" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-49.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-49-300x213.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-49-768x545.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 18</figcaption></figure>
</div>

<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="218" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-50.png" alt="" class="wp-image-2344" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-50.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-50-300x76.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-50-768x194.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 19</figcaption></figure>
</div>
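

<p>Since the code is shown only as screenshots, here is one way the averaging job could be written; the author's version may differ. The sketch assumes each line is "name grade" separated by whitespace, sums the grades and the number of subjects per student, and divides the two. The helper names parse, add, and mean and the output folder output_avg are hypothetical. Run the commands from inside the ch10_scala folder.</p>



<pre class="wp-block-code"><code># Run from inside the ch10_scala folder
mkdir -p src/main/scala

# Scala source: per-student average over all files in input_avg
printf '%s\n' \
  'import org.apache.spark.{SparkConf, SparkContext}' \
  '' \
  'object HW10_3_AvgScore {' \
  '  // Turn "name grade" into (name, (grade, 1))' \
  '  def parse(line: String): (String, (Double, Int)) = {' \
  '    val f = line.trim.split("\\s+")' \
  '    (f(0), (f(1).toDouble, 1))' \
  '  }' \
  '  // Add up grade totals and subject counts' \
  '  def add(a: (Double, Int), b: (Double, Int)): (Double, Int) =' \
  '    (a._1 + b._1, a._2 + b._2)' \
  '  // Total divided by count' \
  '  def mean(t: (Double, Int)): Double = t._1 / t._2' \
  '' \
  '  def main(args: Array[String]): Unit = {' \
  '    // Adjust host, port and folders to your own cluster' \
  '    val base = "hdfs://192.168.245.5:9810/ch10"' \
  '    val sc = new SparkContext(new SparkConf().setAppName("HW10_3_AvgScore"))' \
  '    sc.textFile(base + "/input_avg")' \
  '      .filter(_.trim.nonEmpty)' \
  '      .map(parse)' \
  '      .reduceByKey(add _)' \
  '      .mapValues(mean)' \
  '      .coalesce(1) // write a single output file' \
  '      .saveAsTextFile(base + "/output_avg") // hypothetical output folder' \
  '    sc.stop()' \
  '  }' \
  '}' \
  | tee src/main/scala/HW10_3_AvgScore.scala

# sbt build definition for this task
printf '%s\n' \
  'name := "HW10_3_AvgScore"' \
  'version := "1.0"' \
  'scalaVersion := "2.12.10"' \
  'libraryDependencies += "org.apache.spark" %% "spark-core" % "3.0.0"' \
  | tee HW10_3_AvgScore.sbt
</code></pre>



<p>Carrying a (total, count) pair through <code>reduceByKey</code> keeps the job to a single shuffle and still works when students appear in a different number of subject files.</p>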


<p>5.3 Delete the target and project folders under ch10_scala. The folder contents after deletion are shown in Figure 20.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="129" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-51.png" alt="" class="wp-image-2345" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-51.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-51-300x45.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-51-768x115.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 20</figcaption></figure>
</div>


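<p>5.4 After packaging with <code>sbt package</code>, run <code>spark-submit --class "HW10_3_AvgScore" /usr/local/spark/sparkapps/ch10_scala/target/scala-2.12/hw10_3_avgscore_2.12-1.0.jar</code> to compute the average grades from the data files on HDFS. When the job finishes, inspect the generated file with the cat command; the result is shown in Figure 21.</p>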


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="865" height="141" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-52.png" alt="" class="wp-image-2346" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-52.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-52-300x49.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/05/image-52-768x125.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 21</figcaption></figure>
</div>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>H30505 - Lab 7 - Hive Installation and Programming Practice</title>
		<link>https://www.leexinghai.com/aic/h30505-%e5%ae%9e%e9%aa%8c7-hive%e7%9a%84%e5%ae%89%e8%a3%85%e4%b8%8e%e7%bc%96%e7%a8%8b%e5%ae%9e%e8%b7%b5/</link>
		
		<dc:creator><![CDATA[李星海]]></dc:creator>
		<pubDate>Sat, 22 Apr 2023 08:36:32 +0000</pubDate>
				<category><![CDATA[大数据应用技术]]></category>
		<guid isPermaLink="false">https://aic.leexinghai.com/?p=2183</guid>

					<description><![CDATA[This article has two parts. The first part is a Hive configuration tutorial (XMU's configuration tutorial is somewhat incomplete, which caused me to take [&#8230;]]]></description>
										<content:encoded><![CDATA[
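<p>This article has two parts. The first part is a Hive configuration tutorial (XMU's configuration tutorial is somewhat incomplete, which led me to take some detours while configuring Hive, so I am recording the steps briefly here). The second part is a reference tutorial for the Lab 7 report of the Big Data Application Technology course at Guangzhou College of Commerce.</p>



<h2 class="wp-block-heading">Part 1: Hive Configuration Tutorial</h2>



<p>1. If you are using the virtual machine I published, you can get the Hive installation file directly from the home directory, as shown in Figure 1, and extract it to the /usr/local folder.</p>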



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="341" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-131.png" alt="" class="wp-image-2184" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-131.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-131-300x118.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-131-768x303.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图1</figcaption></figure>



<p>2. Switch to the extraction path, rename the extracted folder, and give the hadoop user ownership of it. The commands are shown in Figure 2.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="193" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-132.png" alt="" class="wp-image-2185" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-132.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-132-300x67.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-132-768x171.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图2</figcaption></figure>



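<p>3. Add the Hive environment variables to the ~/.bashrc file. Because this tutorial continues from the previous lab, only the two places shown in Figure 3 need to change.</p>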



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="181" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-133.png" alt="" class="wp-image-2186" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-133.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-133-300x63.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-133-768x161.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图3</figcaption></figure>



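<p>4. Switch to the Hive configuration path, apply the default configuration template file, and create a new hive-site.xml configuration file, as shown in Figure 4.</p>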



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="89" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-134.png" alt="" class="wp-image-2187" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-134.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-134-300x31.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-134-768x79.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图4</figcaption></figure>



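<p>5. Enter the contents shown in Figure 5 into the configuration file. Note: you can disable the SSL connection and allow public key retrieval here, which reduces the WARNING messages printed when Hive starts.</p>



<p><strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-green-cyan-color">This file can be downloaded from Attachment 3 and used without changes.</mark></strong></p>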



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="446" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-135.png" alt="" class="wp-image-2188" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-135.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-135-300x155.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-135-768x396.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图5</figcaption></figure>
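
<p>As a rough illustration of what such a configuration might contain, the sketch below builds a minimal hive-site.xml in Python. The database name, user name, and password (all set to hive here) are assumptions, and the file in Attachment 3 may contain different or additional properties.</p>

```python
import xml.etree.ElementTree as ET

# Assumed values: MySQL on localhost, database "hive", user "hive", password "hive".
JDBC_URL = (
    "jdbc:mysql://localhost:3306/hive"
    "?createDatabaseIfNotExist=true"
    "&useSSL=false"                   # disable SSL for the local connection
    "&allowPublicKeyRetrieval=true"   # allow public key retrieval
)

PROPERTIES = {
    "javax.jdo.option.ConnectionURL": JDBC_URL,
    "javax.jdo.option.ConnectionDriverName": "com.mysql.cj.jdbc.Driver",
    "javax.jdo.option.ConnectionUserName": "hive",
    "javax.jdo.option.ConnectionPassword": "hive",
}


def build_hive_site(properties):
    """Return hive-site.xml text; ElementTree escapes the ampersands in the URL."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


hive_site_xml = build_hive_site(PROPERTIES)
print(hive_site_xml)
```

<p>Generating the file this way avoids hand-escaping the ampersands that separate the JDBC URL parameters, which is an easy mistake when editing the XML manually.</p>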



<p>6. <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">Note: the actual steps here differ from the screenshots. Follow the text; the images are for reference only!</mark></strong> Extract mysql-connector-java-8.0.11.tar.gz from <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-green-cyan-color">Attachment 2</mark></strong> and copy the resulting jar into the Hive library folder. The extract and copy commands follow the same pattern as Figures 6-7. <mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color"><strong>Reminder: use version 8.0.11, not version 5.1.40 as shown in the images!</strong></mark><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-pale-pink-color"><strong> (Why? Because this machine has mysql-server 8.0.33 installed; MySQL 8.0 and 5.0 are not compatible, and the commands differ as well.)</strong></mark></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="66" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-136.png" alt="" class="wp-image-2189" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-136.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-136-300x23.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-136-768x59.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 6 (make sure the version here is 8.0.11, which can be downloaded from the attachments)</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="48" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-137.png" alt="" class="wp-image-2190" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-137.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-137-300x17.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-137-768x43.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 7 (make sure the version here is 8.0.11, which can be downloaded from the attachments)</figcaption></figure>



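<p>7. Start the MySQL service with <code>service mysql start</code>.</p>



<p>8. Once it has started, log in to the MySQL interactive shell with <code>sudo mysql -u root -p</code> (then press Enter twice). A successful login looks like Figure 8.</p>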



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="689" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-138.png" alt="" class="wp-image-2191" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-138.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-138-300x239.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-138-768x612.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图8</figcaption></figure>



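<p>9. <strong><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">Because this is MySQL 8.0, the commands for granting user privileges differ from XMU's MySQL 5.0 commands. If your MySQL is 8.0, follow the steps here!</mark></strong> Create the hive user and grant it access to all databases and all tables. After granting, refresh the privileges with flush. The commands are shown in Figure 9.</p>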



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="334" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-139.png" alt="" class="wp-image-2192" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-139.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-139-300x116.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-139-768x297.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图9</figcaption></figure>
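
<p>The statements involved can be sketched as follows. This is a small Python helper that only builds the SQL text; the user name, password, and host (hive, hive, localhost) are assumptions and may not match the screenshot. The key difference from MySQL 5 is that the account is created with a separate CREATE USER statement before GRANT is issued.</p>

```python
def hive_grant_statements(user="hive", password="hive", host="localhost"):
    """Build MySQL 8 statements for the step above (all names are assumptions)."""
    account = f"'{user}'@'{host}'"
    return [
        # MySQL 8 does not let GRANT create the account, so create it first.
        f"CREATE USER {account} IDENTIFIED BY '{password}';",
        # Allow access to every database and every table.
        f"GRANT ALL PRIVILEGES ON *.* TO {account};",
        # Reload the privilege tables.
        "FLUSH PRIVILEGES;",
    ]


statements = hive_grant_statements()
for statement in statements:
    print(statement)
```

<p>The printed statements can be pasted into the MySQL shell from step 8.</p>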



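<p>10. Also because of the newer versions, the 5.0-era setup no longer works when connecting through JDBC. Find guava-27.0-jre.jar in the library directory under the hadoop folder, copy it into the library directory under the hive folder, and delete guava-19.0.jar from the hive folder. Only then does Hive run properly (this was another detour for me). See Figure 10.</p>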



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="326" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-140.png" alt="" class="wp-image-2193" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-140.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-140-300x113.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-140-768x289.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图10</figcaption></figure>
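
<p>The swap above amounts to making the guava jar in Hive's library directory match the one shipped with Hadoop. A minimal Python sketch of that operation follows; the function name <code>swap_guava</code> is hypothetical, and the directory paths in the usage comment are assumptions about a typical /usr/local layout.</p>

```python
import shutil
from pathlib import Path


def swap_guava(hadoop_lib, hive_lib, new_jar="guava-27.0-jre.jar"):
    """Replace the guava jars in hive_lib with new_jar copied from hadoop_lib."""
    hadoop_lib, hive_lib = Path(hadoop_lib), Path(hive_lib)
    source = hadoop_lib / new_jar
    if not source.is_file():
        raise FileNotFoundError(source)
    removed = []
    for old in sorted(hive_lib.glob("guava-*.jar")):
        if old.name != new_jar:
            old.unlink()                  # for example guava-19.0.jar
            removed.append(old.name)
    shutil.copy2(source, hive_lib / new_jar)
    return removed


# Assumed paths, adjust to your installation:
# swap_guava("/usr/local/hadoop/share/hadoop/common/lib", "/usr/local/hive/lib")
```

<p>The function returns the names of the jars it removed, so the change can be checked before starting Hive.</p>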



<p>11. Initialize Hive with schematool, as shown in Figure 11.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="319" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-142.png" alt="" class="wp-image-2195" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-142.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-142-300x111.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-142-768x283.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图11</figcaption></figure>



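<p>12. After initialization, type <code>hive</code> in the terminal to start Hive. Running <code>show databases;</code> lists the databases in Hive. As shown in Figure 12, there is one database named "default".</p>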



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="685" height="205" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-141.png" alt="" class="wp-image-2194" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-141.png 685w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-141-300x90.png 300w" sizes="auto, (max-width: 685px) 100vw, 685px" /><figcaption class="wp-element-caption">图12</figcaption></figure>



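<h2 class="wp-block-heading">Part 2: Reference Tutorial for the Lab 7 Report, Big Data Application Technology Course, Guangzhou College of Commerce</h2>



<p><strong>I. Objectives</strong></p>



<ol class="wp-block-list" type="a">
<li>Understand the role of Hive as a data warehouse in the Hadoop architecture;</li>



<li>Become proficient with common HiveSQL statements.</li>
</ol>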



<p><strong>II. Equipment and Materials</strong></p>



<ol class="wp-block-list" type="a">
<li>Ubuntu 20.04</li>



<li>Hadoop 3.3.1</li>



<li>Hive 3.1.2</li>



<li>JDK 1.8.0_301</li>



<li>Eclipse 2019-12(R)</li>
</ol>



<p><strong>III. Principles</strong></p>



<ol class="wp-block-list" type="a">
<li>How Hive works</li>



<li>Basic Hive operations</li>
</ol>



<p>create: create databases, tables, and views</p>



<p>drop: delete databases, tables, and views</p>



<p>alter: modify databases, tables, and views</p>



<p>show: list databases, tables, and views</p>



<p>describe: describe databases, tables, and views</p>



<p>load: load data into a table</p>



<p>select: query data in a table</p>



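<p><strong>IV. Experiment Content and Steps</strong></p>



<p>1. Preparation: Hive</p>



<p>(1) Download</p>



<p>(2) Install</p>



<p>(3) Configure PATH</p>



<p>(4) Configure Hive</p>



<p>2. Preparation: MySQL</p>



<p>(1) Install</p>



<p>(2) Test</p>



<p>(3) JDBC driver</p>



<p>3. Start Hive</p>



<p>(1) Create a database for Hive in MySQL</p>



<p>(2) Allow Hive to connect to MySQL</p>



<p>(3) Start Hive</p>



<p>(4) Possible problems</p>



<p>(5) Exit Hive and Hadoop</p>



<p>4. Assignments</p>



<p>(1) Create an internal table</p>



<p>(2) Create an external partitioned table</p>



<p>(3) Import data from the stock.csv file into the stocks table</p>



<p>(4) Create an unpartitioned external table</p>



<p>(5) Insert data into the dividends table</p>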



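<p><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-pale-pink-color"><strong>The datasets used in the assignments below are available in Attachment 1.</strong></mark></p>



<p>(6) Query the closing price (price_close) of IBM (symbol=IBM) on every trading day since 2000 on which a dividend was paid</p>



<p>(7) Query whether Apple's stock rose or fell on each trading day in October 2008</p>



<p>(8) Query the record in the stocks table whose closing price exceeds its opening price (price_open) by the largest amount</p>



<p>(9) Query Apple's yearly average adjusted closing price (price_adj_close) in the stocks table</p>



<p>(10) Query the top 3 companies by yearly average adjusted closing price (price_adj_close) for each year</p>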



<p><strong>V. Results and Analysis</strong></p>



<p>(1) Create an internal table. The commands are shown in Figure 13.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="796" height="665" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-156.png" alt="" class="wp-image-2212" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-156.png 796w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-156-300x251.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-156-768x642.png 768w" sizes="auto, (max-width: 796px) 100vw, 796px" /><figcaption class="wp-element-caption">图13</figcaption></figure>



<p>(2) Create an external partitioned table and view its details. The commands are shown in Figures 14-15.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="316" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-155.png" alt="" class="wp-image-2211" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-155.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-155-300x110.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-155-768x281.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图14</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="469" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-154.png" alt="" class="wp-image-2210" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-154.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-154-300x163.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-154-768x416.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图15</figcaption></figure>



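<p>(3) Import data from the stock.csv file into the stocks table (the screenshot shows importing from dividends.csv into the dividends table, which works the same way). The command is shown in Figure 16.</p>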



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="91" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-153.png" alt="" class="wp-image-2209" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-153.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-153-300x32.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-153-768x81.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图16</figcaption></figure>



<p>(4) Create an unpartitioned external table. The commands are shown in Figure 17.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="229" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-152.png" alt="" class="wp-image-2208" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-152.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-152-300x79.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-152-768x203.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图17</figcaption></figure>



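<p>(5) Insert data into the dividends table. This requires enabling dynamic partitioning, allowing every partition column to be assigned dynamically, and setting the maximum number of dynamic partitions that can be created to 1000 (in fact anything above 365 is enough). The commands are shown in Figure 18.</p>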



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="96" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-151.png" alt="" class="wp-image-2207" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-151.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-151-300x33.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-151-768x85.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图18</figcaption></figure>
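
<p>To make the dynamic-partition step more concrete, here is a small Python sketch. The SET statements use real Hive property names, but which property carried the value 1000 in the screenshot is an assumption, as are the partition columns (exchange, symbol), the ymd column, and the made-up sample rows. The function only imitates locally how a dynamic insert groups rows into partitions.</p>

```python
from collections import defaultdict

# Session settings for the step above, written as Hive SET statements.
# Assumption: the limit of 1000 was applied to the per-node property.
SETTINGS = [
    "SET hive.exec.dynamic.partition=true;",
    "SET hive.exec.dynamic.partition.mode=nonstrict;",
    "SET hive.exec.max.dynamic.partitions.pernode=1000;",
]


def dynamic_partition(rows, partition_columns, max_partitions=1000):
    """Group rows by their partition-column values, like a dynamic insert."""
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in partition_columns)
        partitions[key].append(row)
        if len(partitions) > max_partitions:
            raise RuntimeError(f"more than {max_partitions} dynamic partitions")
    return dict(partitions)


# Made-up staging rows with fictional symbols.
staged_rows = [
    {"ymd": "2008-02-07", "dividend": 0.10, "exchange": "NASDAQ", "symbol": "AAA"},
    {"ymd": "2008-05-08", "dividend": 0.10, "exchange": "NASDAQ", "symbol": "AAA"},
    {"ymd": "2008-02-06", "dividend": 0.40, "exchange": "NYSE", "symbol": "BBB"},
]
partitions = dynamic_partition(staged_rows, ["exchange", "symbol"])
for key, rows in sorted(partitions.items()):
    print(key, len(rows))
```

<p>Each distinct combination of partition values becomes one partition, which is why the limit has to be at least as large as the number of distinct combinations in the data.</p>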



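<p>(6) Query the closing price (price_close) of IBM (symbol=IBM) on every trading day since 2000 on which a dividend was paid. The query is shown in Figure 19 and the result in Figure 20. <mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-purple-color">(No records for the IBM symbol could be found in the dataset, so this tutorial uses IBCPO instead.)</mark></p>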



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="229" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-150.png" alt="" class="wp-image-2206" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-150.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-150-300x79.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-150-768x203.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图19</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="758" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-149.png" alt="" class="wp-image-2205" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-149.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-149-300x263.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-149-768x673.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图20</figcaption></figure>
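
<p>The logic of this query is a join between stocks and dividends on symbol and date, filtered by symbol and year. The Python sketch below shows that logic on made-up rows; the date column name ymd and all values are assumptions, and the HiveQL in the screenshot may be written differently.</p>

```python
# Made-up sample rows; the ymd column name is an assumption.
stocks = [
    {"symbol": "IBM", "ymd": "2000-02-08", "price_close": 118.0},
    {"symbol": "IBM", "ymd": "2000-02-09", "price_close": 117.0},
    {"symbol": "IBM", "ymd": "1999-11-08", "price_close": 95.0},
    {"symbol": "AAPL", "ymd": "2000-02-08", "price_close": 114.0},
]
dividends = [
    {"symbol": "IBM", "ymd": "2000-02-08", "dividend": 0.12},
    {"symbol": "IBM", "ymd": "1999-11-08", "dividend": 0.12},
]


def closes_on_dividend_days(stocks, dividends, symbol, since="2000-01-01"):
    """Closing prices of one symbol on days with a dividend, from `since` on."""
    paid = {(d["symbol"], d["ymd"]) for d in dividends}   # join keys
    return [
        (s["ymd"], s["price_close"])
        for s in stocks
        if s["symbol"] == symbol
        and s["ymd"] >= since          # ISO dates compare correctly as strings
        and (s["symbol"], s["ymd"]) in paid
    ]


result = closes_on_dividend_days(stocks, dividends, "IBM")
print(result)
```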



<p>(7) Query whether Apple's stock rose or fell on each trading day in October 2008. The query and its result are shown in Figure 21.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="710" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-148.png" alt="" class="wp-image-2204" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-148.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-148-300x246.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-148-768x630.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图21</figcaption></figure>
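
<p>One way to read "rose or fell" is to compare each day's closing price with its opening price. The sketch below applies that reading to made-up rows; the ymd column name, the labels, and the prices are assumptions, and the query in the screenshot may define the trend differently.</p>

```python
# Made-up sample rows; the ymd column name is an assumption.
stocks = [
    {"symbol": "AAPL", "ymd": "2008-09-30", "price_open": 10.0, "price_close": 12.0},
    {"symbol": "AAPL", "ymd": "2008-10-02", "price_open": 11.0, "price_close": 10.5},
    {"symbol": "AAPL", "ymd": "2008-10-01", "price_open": 10.0, "price_close": 11.0},
    {"symbol": "AAPL", "ymd": "2008-10-03", "price_open": 10.5, "price_close": 10.5},
    {"symbol": "IBM", "ymd": "2008-10-01", "price_open": 90.0, "price_close": 95.0},
]


def daily_trend(rows, symbol, month):
    """Label each trading day of one symbol in one month (format YYYY-MM)."""
    trend = []
    for row in sorted(rows, key=lambda r: r["ymd"]):
        if row["symbol"] != symbol or not row["ymd"].startswith(month):
            continue
        change = row["price_close"] - row["price_open"]
        label = "rise" if change > 0 else "fall" if change < 0 else "flat"
        trend.append((row["ymd"], label))
    return trend


trend = daily_trend(stocks, "AAPL", "2008-10")
print(trend)
```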



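<p>(8) Query the record in the stocks table whose closing price exceeds its opening price (price_open) by the largest amount. The query and its result are shown in Figure 22.</p>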



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="757" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-147.png" alt="" class="wp-image-2203" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-147.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-147-300x263.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-147-768x672.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图22</figcaption></figure>
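
<p>The underlying computation is a maximum over the difference price_close - price_open. A minimal Python sketch on made-up rows with fictional symbols follows; in HiveQL this is commonly written as an ORDER BY on the difference with LIMIT 1, though the screenshot may use another form.</p>

```python
# Made-up sample rows with fictional symbols.
stocks = [
    {"symbol": "AAA", "ymd": "2008-10-01", "price_open": 10.0, "price_close": 11.0},
    {"symbol": "BBB", "ymd": "2008-10-01", "price_open": 20.0, "price_close": 27.5},
    {"symbol": "CCC", "ymd": "2008-10-02", "price_open": 50.0, "price_close": 45.0},
]


def biggest_intraday_gain(rows):
    """Return the row where the close exceeds the open by the largest amount."""
    return max(rows, key=lambda row: row["price_close"] - row["price_open"])


best = biggest_intraday_gain(stocks)
print(best["symbol"], best["price_close"] - best["price_open"])
```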



<p>(9) Query Apple's yearly average adjusted closing price (price_adj_close) in the stocks table. The query is shown in Figure 24 and the result in Figure 25.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="341" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-146.png" alt="" class="wp-image-2202" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-146.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-146-300x118.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-146-768x303.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图24</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="298" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-145.png" alt="" class="wp-image-2201" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-145.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-145-300x103.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-145-768x265.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图25</figcaption></figure>
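
<p>This query is a group-by on the year of the trading date followed by an average. The sketch below reproduces that logic in Python on made-up rows; the ymd column name and its YYYY-MM-DD format are assumptions.</p>

```python
from collections import defaultdict

# Made-up sample rows; the ymd column name is an assumption.
stocks = [
    {"symbol": "AAPL", "ymd": "2008-01-02", "price_adj_close": 10.0},
    {"symbol": "AAPL", "ymd": "2008-06-02", "price_adj_close": 20.0},
    {"symbol": "AAPL", "ymd": "2009-01-02", "price_adj_close": 30.0},
    {"symbol": "IBM", "ymd": "2008-01-02", "price_adj_close": 99.0},
]


def yearly_average(rows, symbol):
    """Average price_adj_close per year for one symbol."""
    totals = defaultdict(lambda: [0.0, 0])        # year -> [sum, count]
    for row in rows:
        if row["symbol"] == symbol:
            year = row["ymd"][:4]                 # group key: the year
            totals[year][0] += row["price_adj_close"]
            totals[year][1] += 1
    return {year: total / count for year, (total, count) in sorted(totals.items())}


averages = yearly_average(stocks, "AAPL")
print(averages)
```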



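<p>(10) Query the top 3 companies by yearly average adjusted closing price (price_adj_close) for each year. The query is shown in Figure 26 and the result in Figure 27.</p>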



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="354" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-144.png" alt="" class="wp-image-2200" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-144.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-144-300x123.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-144-768x314.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图26</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="806" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-143.png" alt="" class="wp-image-2199" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-143.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-143-300x280.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-143-768x716.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图27</figcaption></figure>
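
<p>The top-3 query combines two steps: average per (year, symbol), then rank within each year. A Python sketch of that two-step logic on made-up rows follows; the ymd column name and the fictional symbols are assumptions. In HiveQL the ranking step is often done with a window function such as row_number(), but the screenshot may take another approach.</p>

```python
from collections import defaultdict

# Made-up sample rows with fictional symbols.
stocks = [
    {"symbol": "AAA", "ymd": "2008-01-02", "price_adj_close": 10.0},
    {"symbol": "BBB", "ymd": "2008-01-02", "price_adj_close": 30.0},
    {"symbol": "BBB", "ymd": "2008-06-02", "price_adj_close": 50.0},
    {"symbol": "CCC", "ymd": "2008-01-02", "price_adj_close": 30.0},
    {"symbol": "DDD", "ymd": "2008-01-02", "price_adj_close": 20.0},
    {"symbol": "AAA", "ymd": "2009-01-02", "price_adj_close": 5.0},
]


def top_companies_per_year(rows, n=3):
    """For each year, the n symbols with the highest average price_adj_close."""
    prices = defaultdict(list)
    for row in rows:                                   # step 1: group
        prices[(row["ymd"][:4], row["symbol"])].append(row["price_adj_close"])
    by_year = defaultdict(list)
    for (year, symbol), values in prices.items():      # step 1: average
        by_year[year].append((sum(values) / len(values), symbol))
    return {                                           # step 2: rank per year
        year: [symbol for _, symbol in sorted(ranked, reverse=True)[:n]]
        for year, ranked in sorted(by_year.items())
    }


top3 = top_companies_per_year(stocks)
print(top3)
```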



<h2 class="wp-block-heading">Attachments</h2>



<div class="wp-block-file"><a id="wp-block-file--media-2353451f-add2-45a6-bfc7-6eff4fe5f0ce" href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/数据集.7z">Attachment 1: Dataset</a></div>



<div class="wp-block-file"><a id="wp-block-file--media-d10dcc0f-394f-46a9-8b04-ab8a1caf8844" href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/mysql-connector-java-8.0.11.7z">Attachment 2: mysql-connector-java-8.0.11</a></div>



<div class="wp-block-file"><a id="wp-block-file--media-dc331a32-6b8a-408a-b7cc-2404f2c0c795" href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/hive-site.7z">Attachment 3: hive-site.xml configuration file</a></div>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>H30429 - Lab 6 - MapReduce Programming Practice</title>
		<link>https://www.leexinghai.com/aic/h30429-%e5%ae%9e%e9%aa%8c6-mapreduce%e7%9a%84%e7%bc%96%e7%a8%8b%e5%ae%9e%e8%b7%b5/</link>
		
		<dc:creator><![CDATA[李星海]]></dc:creator>
		<pubDate>Wed, 19 Apr 2023 02:26:16 +0000</pubDate>
				<category><![CDATA[大数据应用技术]]></category>
		<guid isPermaLink="false">https://aic.leexinghai.com/?p=2135</guid>

					<description><![CDATA[I. Objectives II. Equipment and Materials III. Principles: Map function + Reduce function + Shuffle process IV. Experiment [&#8230;]]]></description>
										<content:encoded><![CDATA[
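<p><strong>I. Objectives</strong></p>



<ol class="wp-block-list" type="a">
<li>Learn the basic MapReduce programming approach through hands-on practice;</li>



<li>Learn to use MapReduce to solve common data processing problems, including data deduplication, data sorting, and data mining.</li>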
</ol>



<p><strong>II. Equipment and Materials</strong></p>



<ol class="wp-block-list" type="a">
<li>Ubuntu 20.0.1</li>



<li>Hadoop 3.3.1 (with at least pseudo-distributed mode set up)</li>



<li>JDK 1.8.0_301</li>



<li>Eclipse 2019-12(R)</li>
</ol>



<p><strong>III. Principles</strong></p>



<p>Map function + Reduce function + Shuffle process</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="709" height="470" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-103.png" alt="" class="wp-image-2136" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-103.png 709w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-103-300x199.png 300w" sizes="auto, (max-width: 709px) 100vw, 709px" /></figure>
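
<p>As a rough illustration of how the three pieces fit together, the sketch below imitates Map, Shuffle, and Reduce in a single Python process. It is not the Hadoop Java API used in this lab; the function names <code>run_mapreduce</code>, <code>word_mapper</code>, and <code>count_reducer</code> are hypothetical, and word counting is used only as a familiar example.</p>

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Local, single-process imitation of map, shuffle, and reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):     # Map: emit (key, value) pairs
            groups[key].append(value)         # Shuffle: group values by key
    output = []
    for key in sorted(groups):                # keys are processed in sorted order
        output.extend(reducer(key, groups[key]))   # Reduce: one call per key
    return output


def word_mapper(line):
    for word in line.split():
        yield word, 1


def count_reducer(word, counts):
    yield word, sum(counts)


result = run_mapreduce(["a b a", "b c"], word_mapper, count_reducer)
print(result)
```

<p>In a real cluster the map and reduce calls run on different nodes and the shuffle moves data across the network, but the contract between the three stages is the same.</p>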



<p><strong>IV. Experiment Content and Steps</strong></p>



<p>1. Merge and deduplicate files using the Java API</p>



<p>(1) Input</p>



<p>(2) Processing</p>



<p>(3) Output</p>



<p>2. Sort input files using the Java API</p>



<p>(1) Input</p>



<p>(2) Processing</p>



<p>(3) Output</p>



<p>3. Mine information from a given table using the Java API</p>



<p>(1) Input</p>



<p>(2) Processing</p>



<p>(3) Output</p>



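<p><strong>V. Results and Analysis</strong></p>



<h2 class="wp-block-heading">0. In this tutorial, steps 1-5 merge and deduplicate files using the Java API; steps 7-10 sort input files using the Java API; steps 12-15 mine information from a given table using the Java API. The related input files and class files can be downloaded from the attachments.</h2>



<p>1. Start hadoop and hbase. The commands and results are shown in Figure 1.</p>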



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="468" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-104.png" alt="" class="wp-image-2137" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-104.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-104-300x162.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-104-768x416.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图1</figcaption></figure>



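<p>2. Create two input source files in the home directory, named ch701.txt and ch702.txt in this example. Their contents can be taken from the lab guide, or you can write your own text. The text taken from the lab guide is shown in Figure 2.</p>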



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="680" height="575" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-105.png" alt="" class="wp-image-2138" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-105.png 680w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-105-300x254.png 300w" sizes="auto, (max-width: 680px) 100vw, 680px" /><figcaption class="wp-element-caption">图2</figcaption></figure>



<p>3. Create a folder on HDFS for this experiment, then upload the two files into it. The commands are shown in Figure 3.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="266" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-106.png" alt="" class="wp-image-2139" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-106.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-106-300x92.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-106-768x236.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图3</figcaption></figure>



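<p>4. Create a class file named Merge; the code can be taken from the lab guide. After compiling, export it as a runnable JAR file, following the export steps in Figures 4-5. Once exported, use the hadoop jar command to run the JAR and process the data, as shown in Figure 6.</p>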



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="568" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-107.png" alt="" class="wp-image-2140" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-107.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-107-300x197.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-107-768x504.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图4</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="411" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-108.png" alt="" class="wp-image-2141" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-108.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-108-300x143.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-108-768x365.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图5</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="23" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-109.png" alt="" class="wp-image-2142" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-109.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-109-300x8.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-109-768x20.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图6</figcaption></figure>



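<p>5. When the job finishes, use the cat command on HDFS to print the contents of the output folder. The output shows that duplicate entries have been removed, as in Figure 7.</p>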



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="451" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-110.png" alt="" class="wp-image-2143" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-110.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-110-300x156.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-110-768x400.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图7</figcaption></figure>
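
<p>The idea behind the Merge job can be sketched locally in a few lines of Python: each line acts as a key, so identical lines from both files collapse into one during the shuffle. This is only an imitation of the logic, not the Java code from the lab guide, and the sample lines are made up.</p>

```python
def merge_and_dedupe(*files):
    """Each file is a list of lines; return the distinct lines in sorted order."""
    seen = set()
    for lines in files:
        for line in lines:
            line = line.strip()
            if line:
                seen.add(line)    # identical keys collapse, as in the shuffle
    return sorted(seen)


# Made-up contents standing in for ch701.txt and ch702.txt.
file_a = ["20150101 x", "20150102 y", "20150103 x"]
file_b = ["20150101 y", "20150102 y", "20150103 x"]

merged = merge_and_dedupe(file_a, file_b)
for line in merged:
    print(line)
```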



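<h2 class="wp-block-heading">6. The following steps sort input files using the Java API.</h2>



<p>7. Create three input source files in the home directory, named ch703.txt, ch704.txt, and ch705.txt in this example. Their contents can be taken from the lab guide, or you can write your own text. The text taken from the lab guide is shown in Figure 8.</p>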



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="812" height="478" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-111.png" alt="" class="wp-image-2144" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-111.png 812w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-111-300x177.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-111-768x452.png 768w" sizes="auto, (max-width: 812px) 100vw, 812px" /><figcaption class="wp-element-caption">图8</figcaption></figure>



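<p>8. Create a class file named MergeSort, as shown in Figure 9; the code can be taken from the lab guide. After compiling, export it as a runnable JAR file, following the export steps in Figures 4-5. The export result screen is shown in Figure 10.</p>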



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="636" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-112.png" alt="" class="wp-image-2145" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-112.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-112-300x221.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-112-768x565.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图9</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="657" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-113.png" alt="" class="wp-image-2146" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-113.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-113-300x228.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-113-768x583.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图10</figcaption></figure>



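<p>9. Create a folder on HDFS for this experiment, then upload the three files into it; the commands are shown in Figure 11. Once the JAR is exported, use the hadoop jar command to run it and process the data, as shown in Figure 12.</p>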



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="337" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-114.png" alt="" class="wp-image-2147" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-114.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-114-300x117.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-114-768x299.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图11</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="22" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-115.png" alt="" class="wp-image-2148" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-115.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-115-300x8.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-115-768x20.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图12</figcaption></figure>



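<p>10. When the job finishes, use the cat command on HDFS to print the contents of the output folder. The output shows that the text has been sorted, as in Figure 13.</p>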



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="433" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-116.png" alt="" class="wp-image-2149" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-116.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-116-300x150.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-116-768x384.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图13</figcaption></figure>
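
<p>The sorting job can be imitated locally as follows, assuming each input line holds one integer and the output pairs each value with its rank. Both the input format and the output format are assumptions; the MergeSort class in the lab guide may differ.</p>

```python
def merge_sort_files(*files):
    """Merge integer lines from several files and return (rank, value) pairs."""
    numbers = [int(line) for lines in files for line in lines if line.strip()]
    return list(enumerate(sorted(numbers), start=1))


# Made-up contents standing in for ch703.txt, ch704.txt, and ch705.txt.
file_a = ["33", "37", "12"]
file_b = ["4", "16"]
file_c = ["1", "45", "25"]

ranked = merge_sort_files(file_a, file_b, file_c)
for rank, value in ranked:
    print(rank, value)
```

<p>In MapReduce the sorting itself comes for free from the shuffle, which delivers keys to the reducer in order; the reducer then only needs to attach the running rank.</p>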



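<h2 class="wp-block-heading">11. The following steps mine information from a given table using the Java API.</h2>



<p>12. Create an input source file in the home directory, named input_mining.txt in this example. Its contents can be taken from the lab guide, or you can write your own text (anything in a child parent format will do). The text taken from the lab guide is shown in Figure 14.</p>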



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="619" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-117.png" alt="" class="wp-image-2150" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-117.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-117-300x215.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-117-768x550.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">图14</figcaption></figure>



<p>13. Create the DataMining class file; the approximate code is shown in Figure 15. For exporting and packaging the file, see the earlier steps of this example.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="516" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-118.png" alt="" class="wp-image-2151" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-118.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-118-300x179.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-118-768x458.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 15</figcaption></figure>
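<p>The DataMining code itself appears only as a screenshot, so the following is a rough plain-Java sketch of the kind of relationship mining a child parent table allows. It assumes the goal is to derive grandchild/grandparent pairs by joining the table with itself; that goal, the class name GrandparentJoinSketch, and the sample names are assumptions for illustration. It uses no Hadoop classes and is not the code from the attachment.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain-Java emulation of a child-parent self-join (assumed task); no Hadoop API is used.
final class GrandparentJoinSketch {

    // Each input line is "child parent" separated by whitespace.
    // Returns sorted "grandchild<TAB>grandparent" lines.
    static List<String> join(List<String> lines) {
        // "Map" side: index each person's children and each person's parents.
        Map<String, Set<String>> childrenOf = new TreeMap<>();
        Map<String, Set<String>> parentsOf = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 2 || parts[0].equalsIgnoreCase("child")) {
                continue; // skip blank or malformed lines and the header row
            }
            childrenOf.computeIfAbsent(parts[1], k -> new TreeSet<>()).add(parts[0]);
            parentsOf.computeIfAbsent(parts[0], k -> new TreeSet<>()).add(parts[1]);
        }
        // "Reduce" side: for every person in the middle generation,
        // pair each of their children with each of their parents.
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> middle : childrenOf.entrySet()) {
            Set<String> grandparents = parentsOf.get(middle.getKey());
            if (grandparents == null) {
                continue; // this person has no recorded parents
            }
            for (String grandchild : middle.getValue()) {
                for (String grandparent : grandparents) {
                    result.add(grandchild + "\t" + grandparent);
                }
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // Made-up sample rows in the same "child parent" format.
        List<String> sample = Arrays.asList("child parent", "Steven Lucy", "Lucy Mary", "Lucy Frank");
        join(sample).forEach(System.out::println);
    }
}
```

<p>In a real MapReduce job the two indexes would not be held in memory; a mapper would typically emit each record twice, keyed once by the child and once by the parent, and the reducer would do the pairing per key.</p>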



<p>14. Create a folder for this experiment on HDFS and upload the file into it; the commands are shown in Figure 16. Once the JAR has been exported, use the hadoop jar command to run it and process the data; the command is shown in Figure 17.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="80" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-119.png" alt="" class="wp-image-2152" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-119.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-119-300x28.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-119-768x71.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 16</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="164" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-120.png" alt="" class="wp-image-2153" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-120.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-120-300x57.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-120-768x146.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 17</figcaption></figure>



<p>15. After the job finishes, use the cat command on HDFS to print the files in the output folder. The output confirms that the relationship mining succeeded, as shown in Figure 18.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="456" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-121.png" alt="" class="wp-image-2154" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-121.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-121-300x158.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-121-768x405.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 18</figcaption></figure>



<h2 class="wp-block-heading">Attachment Download</h2>



<div class="wp-block-file"><a id="wp-block-file--media-18a3d30c-965b-44d5-b6f5-83313a8067f7" href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/实验6数据文件.7z">Experiment 6 data files</a><a href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/实验6数据文件.7z" class="wp-block-file__button wp-element-button" download aria-describedby="wp-block-file--media-18a3d30c-965b-44d5-b6f5-83313a8067f7">Download</a></div>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>H30501-Experiment 5-WordCount</title>
		<link>https://www.leexinghai.com/aic/h30501-%e5%ae%9e%e9%aa%8c5-wordcount/</link>
		
		<dc:creator><![CDATA[李星海]]></dc:creator>
		<pubDate>Wed, 12 Apr 2023 00:50:01 +0000</pubDate>
				<category><![CDATA[大数据应用技术]]></category>
		<guid isPermaLink="false">https://aic.leexinghai.com/?p=2091</guid>

					<description><![CDATA[I. Objectives II. Equipment and Materials III. Principle First, create two files locally on the Linux system, namely wordfi [&#8230;]]]></description>
										<content:encoded><![CDATA[
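<p><strong>I. Objectives</strong></p>



<ol class="wp-block-list" type="a">
<li>Learn the basic MapReduce programming approach through hands-on practice;</li>



<li>Learn to use MapReduce to solve common data processing problems, including deduplication, sorting, and data mining.</li>
</ol>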



<p><strong>II. Equipment and Materials</strong></p>



<ol class="wp-block-list" type="a">
<li>Ubuntu 20.04.1</li>



<li>Hadoop 3.3.1 (with at least pseudo-distributed mode set up)</li>



<li>JDK 1.8.0_301</li>



<li>Eclipse 2019-12(R)</li>
</ol>



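<p><strong>III. Principle</strong></p>



<p>First, create two files locally on the Linux system: wordfile1.txt and wordfile2.txt. In real applications these files could be very large and would be stored across multiple nodes. To keep the task simple, the two files here contain only a few short lines. Note that the MapReduce word-count program written for these two small sample datasets can process large-scale datasets without any modification.</p>



<p>The contents of wordfile1.txt are as follows:</p>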



<p>I love Spark</p>



<p>I love Hadoop</p>



<p>The contents of wordfile2.txt are as follows:</p>



<p>Hadoop is good</p>



<p>Spark is fast</p>



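<p>Assume HDFS contains an empty folder /user/hadoop/input. Upload wordfile1.txt and wordfile2.txt to this input folder. The task is to design a word-count program that counts how many times each word occurs across all files in the input folder; that is, the program should produce output of the following form:</p>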



<p>fast&nbsp; 1</p>



<p>good&nbsp;&nbsp; 1</p>



<p>Hadoop&nbsp;&nbsp; 2</p>



<p>I&nbsp;&nbsp;&nbsp; 2</p>



<p>is&nbsp;&nbsp; 2</p>



<p>love&nbsp;&nbsp; 2</p>



<p>Spark&nbsp;&nbsp; 2</p>



<p><strong>IV. Content and Steps</strong></p>



<p>1. Input</p>



<p>2. Processing</p>



<p>(1) Map logic</p>



<p>(2) Reduce logic</p>



<p>(3) Testing with main()</p>



<p>3. Output</p>



<p>(1) Compile and package</p>



<p>(2) Run the JAR</p>
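<p>The outline above can be illustrated with a small plain-Java emulation of the three phases. This is only a sketch of the map, shuffle, and reduce logic; it deliberately avoids the Hadoop API (so it is not the WordCount class from the attachment), and the class and method names are made up for illustration.</p>

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java emulation of the word-count phases; no Hadoop classes are involved.
final class WordCountSketch {

    // Map step: emit a (word, 1) pair for every whitespace-separated token in one line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new SimpleEntry<String, Integer>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle step: group the emitted values by key, keeping keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum the values collected for a single word.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: map every line, shuffle the pairs, then reduce each group.
    static Map<String, Integer> count(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // The four lines of wordfile1.txt and wordfile2.txt from the text above.
        List<String> lines = Arrays.asList("I love Spark", "I love Hadoop", "Hadoop is good", "Spark is fast");
        count(lines).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

<p>Because the keys are plain Java strings in a TreeMap, capitalized words sort before lowercase ones here, so the order of the printed lines can differ from the listing above even though the counts are the same.</p>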



<p><strong>V. Results and Analysis</strong></p>



<p>1. Create the two files wordfile1 and wordfile2; the commands are shown in Figure 1.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="804" height="69" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-78.png" alt="" class="wp-image-2092" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-78.png 804w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-78-300x26.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-78-768x66.png 768w" sizes="auto, (max-width: 804px) 100vw, 804px" /><figcaption class="wp-element-caption">Figure 1</figcaption></figure>



<p>2. Enter some text in both files; here the text specified in the experiment requirements is used. The echoed file contents are shown in Figure 2.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="121" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-82.png" alt="" class="wp-image-2096" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-82.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-82-300x42.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-82-768x107.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 2</figcaption></figure>



<p>3. Create a new class named WordCount in Eclipse, write the program with the Java API, and package it (the code can be downloaded from the attachment at the end of this post). Then upload the package to the Hadoop server. Part of the code is shown in Figure 3.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="520" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-79.png" alt="" class="wp-image-2093" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-79.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-79-300x180.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-79-768x462.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 3</figcaption></figure>



<p>4. Use dfs commands to delete the input and output folders on HDFS, recreate the input folder (alternatively, just delete every file under input), and upload wordfile1 and wordfile2 to it. Finally, run the locally built JAR file to read the files in the input folder and count word frequencies, writing the result to the output folder (the job fails if a folder with that name already exists). The commands are shown in Figure 4.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="303" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-80.png" alt="" class="wp-image-2094" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-80.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-80-300x105.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-80-768x269.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 4</figcaption></figure>



<p>5. Use the cat command to print the contents of the output folder; the word count completed successfully, as shown in Figure 5.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="333" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-81.png" alt="" class="wp-image-2095" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-81.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-81-300x115.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-81-768x296.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 5</figcaption></figure>



<h2 class="wp-block-heading">Attachment Download</h2>



<div class="wp-block-file"><a id="wp-block-file--media-56164d9b-6b13-42a6-9607-5611aec8dc2d" href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/WordCount.7z">WordCount</a><a href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/WordCount.7z" class="wp-block-file__button wp-element-button" download aria-describedby="wp-block-file--media-56164d9b-6b13-42a6-9607-5611aec8dc2d">Download</a></div>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>H30420-HBase Configuration Tutorial and Experiment 4</title>
		<link>https://www.leexinghai.com/aic/h30420-hbase%e9%85%8d%e7%bd%ae%e6%95%99%e7%a8%8b%e5%92%8c%e5%ae%9e%e9%aa%8c4/</link>
		
		<dc:creator><![CDATA[李星海]]></dc:creator>
		<pubDate>Sun, 09 Apr 2023 03:47:07 +0000</pubDate>
				<category><![CDATA[大数据应用技术]]></category>
		<guid isPermaLink="false">https://aic.leexinghai.com/?p=2033</guid>

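					<description><![CDATA[This post has two parts. The first is an HBase configuration tutorial (the XMU configuration tutorial is somewhat incomplete, which, while I was configuring HBase, led me [&#8230;]]]></description>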
										<content:encoded><![CDATA[
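<p>This post has two parts. The first is an HBase configuration tutorial (the XMU configuration tutorial is somewhat incomplete, which led me down a few wrong paths while configuring HBase, so I am recording the process briefly here). The second is a reference walkthrough for the Experiment 4 report of the Big Data Application Technology course at Guangzhou College of Commerce.</p>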



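<h1 class="wp-block-heading">Part 1 - HBase Configuration Tutorial</h1>



<p>0. If you are using the virtual machine image I built, or the UbuntuKylin 16.04 image obtained from Teacher Zhou, you can follow this tutorial directly. Otherwise, first download hbase-2.2.2-bin.tar.gz to your home directory and complete the Hadoop configuration by following the earlier tutorial before continuing here.</p>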



<p>1. In the hadoop user's home directory, open a terminal and use the tar command to extract HBase 2.2.2, then set its permissions, rename it, and finally move it to the /usr/local/ folder. Reference commands are shown in Figure 1.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="314" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-35.png" alt="" class="wp-image-2035" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-35.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-35-300x109.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-35-768x279.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 1</figcaption></figure>



<p>2. (Optional; skip it if you want to save time.) Use vim ~/.bashrc to edit the environment variables and add the bin path of the hbase folder to PATH; see Figure 2 for the path. After that, use the source command to reload the environment file, then run hbase version to check the version number. If the configuration is correct, HBase prints the correct version number, as shown in Figure 3.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="109" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-36.png" alt="" class="wp-image-2037" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-36.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-36-300x38.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-36-768x97.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 2</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="269" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-38.png" alt="" class="wp-image-2039" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-38.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-38-300x93.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-38-768x239.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 3</figcaption></figure>



<p>3. Edit the HBase environment file; the command is shown in Figure 4.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="794" height="35" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-39.png" alt="" class="wp-image-2040" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-39.png 794w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-39-300x13.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-39-768x34.png 768w" sizes="auto, (max-width: 794px) 100vw, 794px" /><figcaption class="wp-element-caption">Figure 4</figcaption></figure>



<p>4. Uncomment lines 28, 31, 126, and 139, and set the correct paths on lines 28 and 31; for the paths, refer to those shown in Figure 5.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="183" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-41.png" alt="" class="wp-image-2042" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-41.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-41-300x63.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-41-768x162.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 5</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="563" height="135" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-42.png" alt="" class="wp-image-2043" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-42.png 563w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-42-300x72.png 300w" sizes="auto, (max-width: 563px) 100vw, 563px" /><figcaption class="wp-element-caption">Figure 6</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="83" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-43.png" alt="" class="wp-image-2044" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-43.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-43-300x29.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-43-768x74.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 7</figcaption></figure>



<p>5. Go back to Hadoop's core-site.xml and note the IP address and port number configured for the NameNode, since they are needed later. The command is shown in Figure 8, and the content to note is shown in Figure 9.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="26" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-44.png" alt="" class="wp-image-2045" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-44.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-44-300x9.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-44-768x23.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 8</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="209" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-45.png" alt="" class="wp-image-2046" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-45.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-45-300x72.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-45-768x186.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 9</figcaption></figure>



<p>6. Edit hbase-site.xml; the command is shown in Figure 10, and the changes inside the configuration element are shown in Figure 11.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="817" height="36" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-46.png" alt="" class="wp-image-2047" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-46.png 817w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-46-300x13.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-46-768x34.png 768w" sizes="auto, (max-width: 817px) 100vw, 817px" /><figcaption class="wp-element-caption">Figure 10</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="340" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-47.png" alt="" class="wp-image-2048" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-47.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-47-300x118.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-47-768x302.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 11</figcaption></figure>



<p>7. Start Hadoop and HBase, then use jps to list the running processes. If the configuration is correct, you should see the HMaster service, as shown in Figure 12.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="303" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-48.png" alt="" class="wp-image-2049" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-48.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-48-300x105.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-48-768x269.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 12</figcaption></figure>



<p>8. Run /usr/local/hbase/bin/hbase shell to enter the HBase shell and try creating a table named teacherLXH. If creation succeeds, a completion message is printed. The exact command is shown in Figure 13.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="88" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-49.png" alt="" class="wp-image-2050" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-49.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-49-300x31.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-49-768x78.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 13</figcaption></figure>



<p>9. Use Eclipse to create a table and add data: you can reuse the earlier HDFS1 project and create a new class in it. The code can be downloaded from the attachment, or copied from the ExampleForHBase.java section of <a href="https://dblab.xmu.edu.cn/blog/2442/">HBase 2.2.2 Installation and Programming Practice Guide, XMU Database Lab Blog (xmu.edu.cn)</a>. After copying, change the address and port number to the values shown in Figure 9; the relevant code region is shown in Figure 14.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="182" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-50.png" alt="" class="wp-image-2051" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-50.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-50-300x63.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-50-768x162.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 14</figcaption></figure>
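<p>Since the edited code region is only visible as a screenshot, here is a small hedged sketch of how the NameNode address noted in step 5 could be turned into an hbase.rootdir value. The address hdfs://localhost:9000 and the /hbase directory are placeholder assumptions, as is the helper class itself; use the values from your own core-site.xml and hbase-site.xml.</p>

```java
import java.net.URI;

// Hypothetical helper: derives an hbase.rootdir value from the NameNode URI in core-site.xml.
final class HBaseRootDirSketch {

    // Accepts something like "hdfs://host:port" and returns "hdfs://host:port/hbase".
    // The "/hbase" directory name is an assumption; it must match hbase-site.xml.
    static String rootDir(String defaultFs) {
        URI uri = URI.create(defaultFs);
        if (!"hdfs".equals(uri.getScheme()) || uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException("expected hdfs://host:port, got " + defaultFs);
        }
        return "hdfs://" + uri.getHost() + ":" + uri.getPort() + "/hbase";
    }

    public static void main(String[] args) {
        // Placeholder address; replace it with the one shown in your core-site.xml.
        System.out.println("hbase.rootdir = " + rootDir("hdfs://localhost:9000"));
    }
}
```

<p>The point of the check is that the host and port in the Java code, in hbase-site.xml, and in core-site.xml all have to agree; a mismatch is a common reason the client cannot find HBase's data on HDFS.</p>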



<p>10. At this point the code shows many errors because the HBase JAR files have not been imported. Right-click the HDFS1 project, open Properties, go to Java Build Path, and import the libraries under Libraries: click "Add External JARs", navigate to the "/usr/local/hbase/lib" directory, select all JAR files in that directory (note: do not select the four directories client-facing-thirdparty, ruby, shaded-clients, and zkcli), and click the "OK" button at the bottom of the dialog. Then go into the client-facing-thirdparty folder and select all JAR files there as well, as shown in Figure 15.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="580" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-51.png" alt="" class="wp-image-2052" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-51.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-51-300x201.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-51-768x515.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 15</figcaption></figure>



<p>11. After the import completes, run the code (Run as Java Application). It should create a student table in HBase and add some data to it, as shown in Figure 16.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="269" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-52.png" alt="" class="wp-image-2053" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-52.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-52-300x93.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-52-768x239.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 16</figcaption></figure>



<h1 class="wp-block-heading">Part 2 - Experiment 4</h1>



<h3 class="wp-block-heading"><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-pale-pink-color">(It is recommended to complete Part 1 before consulting this part.)</mark></h3>



<p><strong>I. Objectives</strong></p>



<p>a)&nbsp;&nbsp;&nbsp; Understand the role of HBase in the Hadoop architecture.</p>



<p>b)&nbsp;&nbsp;&nbsp; Become proficient with the common HBase shell commands.</p>



<p>c)&nbsp;&nbsp;&nbsp; Become familiar with the common Java API for HBase operations.</p>



<p><strong>II. Equipment and Materials</strong></p>



<p>a)&nbsp;&nbsp;&nbsp; Ubuntu 20.04</p>



<p>b)&nbsp;&nbsp;&nbsp; Hadoop 3.3.1</p>



<p>c)&nbsp;&nbsp;&nbsp; HBase 2.4.6</p>



<p>d)&nbsp;&nbsp;&nbsp; JDK 1.8.3</p>



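<p>e)&nbsp;&nbsp;&nbsp; Eclipse R (2019-12)</p>



<p><strong>III. Principle</strong></p>



<p>HBase is a distributed, column-oriented open-source database that originated from Google's paper "Bigtable: A Distributed Storage System for Structured Data". HBase stores data in tables; a table consists of rows and columns, and the columns are grouped into column families. For official information about HBase, visit the HBase website at http://hbase.apache.org.</p>



<p>HBase can run in three modes: standalone, pseudo-distributed, and distributed.</p>



<p>a)&nbsp;&nbsp;&nbsp; Standalone mode: HBase is installed and used on a single computer, with no distributed data storage;</p>



<p>b)&nbsp;&nbsp;&nbsp; Pseudo-distributed mode: a small cluster is simulated on a single computer;</p>



<p>c)&nbsp;&nbsp;&nbsp; Distributed mode: multiple computers are used to achieve physically distributed storage.</p>



<p>For learning purposes, we focus only on standalone mode and pseudo-distributed mode here.</p>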



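<p><strong>IV. Content and Steps</strong></p>



<p>1. Use HBase shell commands to complete the following tasks</p>



<p>1)&nbsp;&nbsp;&nbsp; Create the table student (row key: sid)</p>



<p>2)&nbsp;&nbsp;&nbsp; Add data to the student table (data type: string)</p>



<p>3)&nbsp;&nbsp;&nbsp; Modify data in the student table (data type: string)</p>



<p>4)&nbsp;&nbsp;&nbsp; View the data in the student table</p>



<p>5)&nbsp;&nbsp;&nbsp; Delete data from the teacher table</p>



<p>6)&nbsp;&nbsp;&nbsp; Clear all data from the teacher table</p>



<p>7)&nbsp;&nbsp;&nbsp; List all tables in the HBase database</p>



<p>8)&nbsp;&nbsp;&nbsp; Delete teacher</p>



<p>2. Use Java API programming to implement the following functions</p>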



<p>1)&nbsp;&nbsp;&nbsp; createTable(String tableName, String[] fields)</p>



<p>2)&nbsp;&nbsp;&nbsp; addRecord(String tableName, String row, String[] fields, String[] values)</p>



<p>3)&nbsp;&nbsp;&nbsp; scanColumn(String tableName, String column)</p>



<p>4)&nbsp;&nbsp;&nbsp; modifyData(String tableName, String row, String column, String value)</p>



<p>5)&nbsp;&nbsp;&nbsp; deleteRow(String tableName, String row)</p>



<p>6)&nbsp;&nbsp;&nbsp; Test with main()</p>
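<p>To make the intent of these five signatures concrete, the following is an in-memory stand-in that keeps the same method names but stores data in nested maps instead of calling the HBase client API. The behavior shown (fields naming column families, columns written as family:qualifier, createTable replacing an existing table) is an assumption inferred from the signatures, not the code from the attachment.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for the five functions; it does NOT use the HBase client API.
final class HBaseOpsSketch {
    // table name -> row key -> "family:qualifier" -> value
    private final Map<String, Map<String, Map<String, String>>> tables = new TreeMap<>();
    // table name -> declared column families
    private final Map<String, List<String>> families = new TreeMap<>();

    // Creates a table with the given column families (assumption: an existing table is replaced).
    void createTable(String tableName, String[] fields) {
        tables.put(tableName, new TreeMap<>());
        families.put(tableName, new ArrayList<>(Arrays.asList(fields)));
    }

    // Writes values[i] into column fields[i] ("family:qualifier") of the given row.
    void addRecord(String tableName, String row, String[] fields, String[] values) {
        if (fields.length != values.length) {
            throw new IllegalArgumentException("fields and values differ in length");
        }
        Map<String, String> cells = table(tableName).computeIfAbsent(row, k -> new TreeMap<>());
        for (int i = 0; i < fields.length; i++) {
            String family = fields[i].split(":", 2)[0];
            if (!families.get(tableName).contains(family)) {
                throw new IllegalArgumentException("unknown column family: " + family);
            }
            cells.put(fields[i], values[i]);
        }
    }

    // Lists "row column=value" for one column ("family:qualifier") or a whole family ("family").
    List<String> scanColumn(String tableName, String column) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : table(tableName).entrySet()) {
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                String name = cell.getKey();
                if (name.equals(column) || name.startsWith(column + ":")) {
                    out.add(row.getKey() + " " + name + "=" + cell.getValue());
                }
            }
        }
        return out;
    }

    // Overwrites a single cell; in this model it is simply a one-column addRecord.
    void modifyData(String tableName, String row, String column, String value) {
        addRecord(tableName, row, new String[] {column}, new String[] {value});
    }

    // Removes the whole row, if present.
    void deleteRow(String tableName, String row) {
        table(tableName).remove(row);
    }

    private Map<String, Map<String, String>> table(String tableName) {
        Map<String, Map<String, String>> t = tables.get(tableName);
        if (t == null) {
            throw new IllegalArgumentException("no such table: " + tableName);
        }
        return t;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        HBaseOpsSketch db = new HBaseOpsSketch();
        db.createTable("student", new String[] {"info", "score"});
        db.addRecord("student", "s001", new String[] {"info:name", "score:math"}, new String[] {"Zhangsan", "86"});
        db.modifyData("student", "s001", "score:math", "90");
        db.scanColumn("student", "score").forEach(System.out::println);
        db.deleteRow("student", "s001");
    }
}
```

<p>In the real implementation each of these methods would go through an HBase connection and table handle, but the shape of the data (row key, then column family and qualifier, then value) is the same.</p>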



<p><strong>V. Results and Analysis</strong></p>



<p>1. Use HBase shell commands to complete the following tasks</p>



<p>1)&nbsp;&nbsp;&nbsp; Create the table student (row key: sid)</p>



<p>2)&nbsp;&nbsp;&nbsp; Add data to the student table (data type: string)</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="864" height="294" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-53.png" alt="" class="wp-image-2054" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-53.png 864w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-53-300x102.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-53-768x261.png 768w" sizes="auto, (max-width: 864px) 100vw, 864px" /></figure>



<p>3)&nbsp;&nbsp;&nbsp; Modify data in the student table (data type: string)</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="587" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-54.png" alt="" class="wp-image-2055" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-54.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-54-300x204.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-54-768x521.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>4)&nbsp;&nbsp;&nbsp; View the data in the student table</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="879" height="319" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-55.png" alt="" class="wp-image-2056" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-55.png 879w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-55-300x109.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-55-768x279.png 768w" sizes="auto, (max-width: 879px) 100vw, 879px" /></figure>



<p>5)&nbsp;&nbsp;&nbsp; Delete data from the teacher table</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="460" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-56.png" alt="" class="wp-image-2057" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-56.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-56-300x160.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-56-768x408.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>6)&nbsp;&nbsp;&nbsp; Clear all data from the teacher table</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="391" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-57.png" alt="" class="wp-image-2058" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-57.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-57-300x136.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-57-768x347.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>7)&nbsp;&nbsp;&nbsp; List all tables in the HBase database</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="257" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-58.png" alt="" class="wp-image-2059" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-58.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-58-300x89.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-58-768x228.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>8)&nbsp;&nbsp;&nbsp; Delete teacher</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="283" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-59.png" alt="" class="wp-image-2060" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-59.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-59-300x98.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-59-768x251.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>2. Use Java API programming to implement the following functions</p>



<p>1)&nbsp;&nbsp;&nbsp; createTable(String tableName, String[] fields)</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="308" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-60.png" alt="" class="wp-image-2061" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-60.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-60-300x107.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-60-768x273.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>2)&nbsp;&nbsp;&nbsp; addRecord(String tableName, String row, String[] fields, String[] values)</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="295" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-61.png" alt="" class="wp-image-2062" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-61.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-61-300x102.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-61-768x262.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>3)&nbsp;&nbsp;&nbsp; scanColumn(String tableName, String column)</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="255" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-62.png" alt="" class="wp-image-2063" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-62.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-62-300x88.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-62-768x226.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>4)&nbsp;&nbsp;&nbsp; modifyData(String tableName, String row, String column, String value)</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="409" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-63.png" alt="" class="wp-image-2064" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-63.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-63-300x142.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-63-768x363.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>5)&nbsp;&nbsp;&nbsp; deleteRow(String tableName, String row)</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="282" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-64.png" alt="" class="wp-image-2065" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-64.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-64-300x98.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-64-768x250.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>6)&nbsp;&nbsp;&nbsp; Test with main()</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="589" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-65.png" alt="" class="wp-image-2066" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-65.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-65-300x204.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-65-768x523.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="740" height="828" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-66.png" alt="" class="wp-image-2067" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-66.png 740w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-66-268x300.png 268w" sizes="auto, (max-width: 740px) 100vw, 740px" /></figure>



<h2 class="wp-block-heading">Attachment Download</h2>



<div class="wp-block-file"><a id="wp-block-file--media-0b893e3c-927e-406b-b0c7-b488d1b579aa" href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/实验4参考代码.7z">Experiment 4 reference code</a><a href="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/实验4参考代码.7z" class="wp-block-file__button wp-element-button" download aria-describedby="wp-block-file--media-0b893e3c-927e-406b-b0c7-b488d1b579aa">Download</a></div>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>H30401D-Class Participation Bonus Assignment 1</title>
		<link>https://www.leexinghai.com/aic/h30401d-%e8%af%be%e5%a0%82%e8%a1%a8%e7%8e%b0%e5%8a%a0%e5%88%86%e4%bd%9c%e4%b8%9a1/</link>
		
		<dc:creator><![CDATA[李星海]]></dc:creator>
		<pubDate>Sat, 01 Apr 2023 07:26:52 +0000</pubDate>
				<category><![CDATA[大数据应用技术]]></category>
		<guid isPermaLink="false">https://aic.leexinghai.com/?p=1982</guid>

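					<description><![CDATA[Assignment 1. Task: Develop from Eclipse on the local machine, connected to the pseudo-distributed Hadoop cluster in the virtual machine (10 points) Notes: 1) This assignment [&#8230;]]]></description>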
										<content:encoded><![CDATA[
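<h2 class="wp-block-heading">Assignment</h2>



<p>1. Task: Develop from Eclipse on the local machine, connected to the pseudo-distributed Hadoop cluster in the virtual machine (10 points)</p>



<p>Notes:</p>



<p>1) This assignment counts as a class participation bonus: +10 on completion, and participation is voluntary;</p>



<p>2) See the corresponding documents under "Materials" --&gt; "***Software" on the Chaoxing platform, as well as course video 2.1 on the home page, or&nbsp; https://www.bilibili.com/video/BV1ZY4y1C7cV/?vd_source=4e2a2c6e225cdb4b611a9f728016773f</p>



<p>3) To submit, just take a screenshot of the example's run result in Eclipse on the local machine (5.0 points)</p>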



<h2 class="wp-block-heading">Reference Steps</h2>



<p>1. Use ifconfig to find the virtual machine's IP address, as shown in Figure 1.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="377" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image.png" alt="" class="wp-image-1983" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-300x131.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-768x335.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 1</figcaption></figure>



<p>2. Edit Hadoop's core-site.xml file. The command is shown in Figure 2, and the content to enter is shown in Figure 3.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="35" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-1.png" alt="" class="wp-image-1984" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-1.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-1-300x12.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-1-768x31.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 2</figcaption></figure>



<p></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="397" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-2.png" alt="" class="wp-image-1985" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-2.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-2-300x138.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-2-768x352.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 3</figcaption></figure>
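


<p>A minimal sketch of the kind of core-site.xml change that steps 1 and 2 point to, assuming the goal is to make HDFS reachable from the host machine by using the virtual machine's IP address instead of localhost. The address 192.168.1.100, the port 9000, and the scratch file name core-site.remote.xml are hypothetical placeholders, not the author's values (those are in Figure 3). The sketch writes to a scratch file so the property can be reviewed and then merged into the real file by hand.</p>



<pre class="wp-block-code"><code># Hypothetical VM address; substitute the one reported by ifconfig in step 1.
VM_IP="192.168.1.100"

# Write the property to a scratch file (unquoted EOF so ${VM_IP} is expanded).
cat &gt; core-site.remote.xml &lt;&lt;EOF
&lt;configuration&gt;
    &lt;property&gt;
        &lt;name&gt;fs.defaultFS&lt;/name&gt;
        &lt;value&gt;hdfs://${VM_IP}:9000&lt;/value&gt;
    &lt;/property&gt;
&lt;/configuration&gt;
EOF

# Show the line that carries the address so it can be checked.
grep "hdfs://" core-site.remote.xml</code></pre>



<p>Merge the property into the existing core-site.xml rather than overwriting the file, since it may already contain other settings, and restart HDFS afterwards so the new address takes effect.</p>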



<p></p>



<p>3. Download Eclipse (this example uses the 2019-06 release):</p>



<p><a href="https://archive.eclipse.org/technology/epp/downloads/release/2019-06/R/eclipse-jee-2019-06-R-win32-x86_64.zip">https://archive.eclipse.org/technology/epp/downloads/release/2019-06/R/eclipse-jee-2019-06-R-win32-x86_64.zip</a></p>



<p>4. Extract the archive and open the folder, then place the hadoop-eclipse-plugi.jar file downloaded from Xuexitong (学习通) into Eclipse's plugins directory, as shown in Figure 4.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="696" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-3.png" alt="" class="wp-image-1986" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-3.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-3-300x241.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-3-768x618.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 4</figcaption></figure>



<p></p>



<p>5. Also download hadoop-3.1.3.tar.gz from Xuexitong and extract it to any directory (this example extracts it into the Downloads folder, as shown in Figure 5).</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="398" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-4.png" alt="" class="wp-image-1987" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-4.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-4-300x138.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-4-768x353.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 5</figcaption></figure>



<p></p>



<p>6. Open Eclipse, find Hadoop Map/Reduce on the left side of the Window-Preferences panel, and set the path to the folder you just extracted, as shown in Figures 6 and 7.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="353" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-5.png" alt="" class="wp-image-1988" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-5.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-5-300x122.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-5-768x313.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 6</figcaption></figure>



<p></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="629" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-6.png" alt="" class="wp-image-1989" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-6.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-6-300x218.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-6-768x558.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 7</figcaption></figure>



<p></p>



<p>7. Click the Apply and Close button to save and close the panel.</p>



<p>8. In Window-Show View-Other, expand MapReduce Tools, then select Map/Reduce Locations and open it, as shown in Figures 8 and 9.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="820" height="416" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-7.png" alt="" class="wp-image-1990" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-7.png 820w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-7-300x152.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-7-768x390.png 768w" sizes="auto, (max-width: 820px) 100vw, 820px" /><figcaption class="wp-element-caption">Figure 8</figcaption></figure>



<p></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="700" height="787" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-8.png" alt="" class="wp-image-1991" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-8.png 700w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-8-267x300.png 267w" sizes="auto, (max-width: 700px) 100vw, 700px" /><figcaption class="wp-element-caption">Figure 9</figcaption></figure>



<p></p>



<p>9. The Map/Reduce Locations view now appears at the bottom of the Eclipse window. Right-click in it and choose New Map/Reduce Location, then set Location name, Host, Port, and User name to the values used by your virtual machine, as shown in Figure 10.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="575" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-9.png" alt="" class="wp-image-1992" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-9.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-9-300x199.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-9-768x511.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 10</figcaption></figure>



<p></p>



<p>10. If your Project Explorer contains no project, create one so that DFS Locations can be displayed correctly, as shown in Figures 11 to 13.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="522" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-10.png" alt="" class="wp-image-1993" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-10.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-10-300x181.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-10-768x463.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 11</figcaption></figure>



<p></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="776" height="770" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-11.png" alt="" class="wp-image-1994" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-11.png 776w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-11-300x298.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-11-150x150.png 150w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-11-768x762.png 768w" sizes="auto, (max-width: 776px) 100vw, 776px" /><figcaption class="wp-element-caption">Figure 12</figcaption></figure>



<p></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="863" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-12.png" alt="" class="wp-image-1995" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-12.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-12-300x300.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-12-150x150.png 150w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-12-768x766.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 13</figcaption></figure>



<p></p>



<p>11. Once the project is created, it appears in Project Explorer. If the DFS Locations list still does not appear, click the perspective icon in the upper-right corner and choose Map/Reduce. The list should then be displayed correctly, as shown in Figure 14.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="415" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-13.png" alt="" class="wp-image-1996" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-13.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-13-300x144.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/04/image-13-768x368.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 14</figcaption></figure>



<p></p>



<p>12. This completes the tutorial.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>H30222: Simple Environment Setup for the Big Data Application Technology Course (VMware Method: Hadoop and Eclipse)</title>
		<link>https://www.leexinghai.com/aic/h30222-%e3%80%8a%e5%a4%a7%e6%95%b0%e6%8d%ae%e5%ba%94%e7%94%a8%e6%8a%80%e6%9c%af%e3%80%8b%e8%af%be%e7%a8%8b%e7%ae%80%e6%98%93%e7%8e%af%e5%a2%83%e9%85%8d%e7%bd%aevmware%e6%96%b9%e6%b3%95/</link>
					<comments>https://www.leexinghai.com/aic/h30222-%e3%80%8a%e5%a4%a7%e6%95%b0%e6%8d%ae%e5%ba%94%e7%94%a8%e6%8a%80%e6%9c%af%e3%80%8b%e8%af%be%e7%a8%8b%e7%ae%80%e6%98%93%e7%8e%af%e5%a2%83%e9%85%8d%e7%bd%aevmware%e6%96%b9%e6%b3%95/#comments</comments>
		
		<dc:creator><![CDATA[李星海]]></dc:creator>
		<pubDate>Wed, 22 Feb 2023 09:20:29 +0000</pubDate>
				<category><![CDATA[大数据应用技术]]></category>
		<guid isPermaLink="false">https://aic.leexinghai.com/?p=1789</guid>

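					<description><![CDATA[Preface: The tutorial images and descriptive text in this article (apart from some passages) were created by Li Xinghai (李星海). All configuration was carried out on a real machine. This tutorial applies to 【Guangzhou College of Commerce】【2 [&#8230;]]]></description>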
										<content:encoded><![CDATA[
<figure class="wp-block-pullquote"><blockquote><p>Preface: The tutorial images and descriptive text in this article (apart from some passages) were created by Li Xinghai (李星海). All configuration was carried out on a real machine.<br>This tutorial applies to the course environment of Big Data Application Technology for 【Guangzhou College of Commerce】【Class of 2020】【School of Information Technology and Engineering】【Computer Science and Technology (upgrade from junior college to bachelor's degree)】【Semester 2 of the 2022-2023 academic year】.<br>References:<br><a href="https://dblab.xmu.edu.cn/blog/2441/" data-type="URL" data-id="https://dblab.xmu.edu.cn/blog/2441/">Hadoop 3.1.3 Installation Tutorial: Standalone/Pseudo-Distributed Configuration, Hadoop 3.1.3/Ubuntu 18.04 (16.04)</a><br><a rel="noreferrer noopener" href="https://cxybb.com/article/ACK_ACK/122456308#:~:text=%E6%8A%80%E6%9C%AF%E6%A0%87%E7%AD%BE%EF%BC%9A%20%E9%98%BF%E9%87%8C%E4%BA%91%20Linux%20ubuntu%20Kylin%2016.04%20%E6%8D%A2%E6%BA%90%20kylin,1%E3%80%81%20%E5%A4%87%E4%BB%BD%E5%8E%9F%E6%9D%A5%E7%9A%84%E6%BA%90%20%28%E5%A5%BD%E4%B9%A0%E6%83%AF%29%20cd%20%2Fetc%2Fapt%2F%20cp%20sources.list%20sources.list.bak" data-type="URL" data-id="https://cxybb.com/article/ACK_ACK/122456308#:~:text=%E6%8A%80%E6%9C%AF%E6%A0%87%E7%AD%BE%EF%BC%9A%20%E9%98%BF%E9%87%8C%E4%BA%91%20Linux%20ubuntu%20Kylin%2016.04%20%E6%8D%A2%E6%BA%90%20kylin,1%E3%80%81%20%E5%A4%87%E4%BB%BD%E5%8E%9F%E6%9D%A5%E7%9A%84%E6%BA%90%20%28%E5%A5%BD%E4%B9%A0%E6%83%AF%29%20cd%20%2Fetc%2Fapt%2F%20cp%20sources.list%20sources.list.bak" target="_blank">Switching Ubuntu (Kylin) 16.04 to the Aliyun Mirror (Pou光明's blog, 程序员宝宝)</a></p></blockquote></figure>



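<p>0. Preparation: download the provided software packages to your machine. This tutorial places them in the 【DSJSOFT】 folder on drive D, as shown in Figure 1.</p>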



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="746" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-2.png" alt="" class="wp-image-1793" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-2.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-2-300x259.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-2-768x662.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /></figure>



<p>1. Create a new virtual machine in VMware and point it at the ubuntukylin installation image in that folder, as shown in Figure 2.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="713" height="689" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-3.png" alt="" class="wp-image-1795" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-3.png 713w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-3-300x290.png 300w" sizes="auto, (max-width: 713px) 100vw, 713px" /><figcaption class="wp-element-caption">Figure 2: Specifying the ISO path under the installer disc image file option</figcaption></figure>



<p>2. Enter the Linux full name, user name, and password (the password is jk2005), as shown in Figure 3.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="713" height="689" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-4.png" alt="" class="wp-image-1796" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-4.png 713w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-4-300x290.png 300w" sizes="auto, (max-width: 713px) 100vw, 713px" /><figcaption class="wp-element-caption">Figure 3</figcaption></figure>



<p>3. Set the disk size. 40 GB is recommended here, as shown in Figure 4.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="713" height="689" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-5.png" alt="" class="wp-image-1797" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-5.png 713w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-5-300x290.png 300w" sizes="auto, (max-width: 713px) 100vw, 713px" /><figcaption class="wp-element-caption">Figure 4</figcaption></figure>



<p>4. Make sure the network adapter is in “NAT” mode, as shown in Figure 5.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="508" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-6.png" alt="" class="wp-image-1798" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-6.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-6-300x176.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-6-768x451.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 5</figcaption></figure>



<p>5. The installation now starts and takes about 20 minutes. Just wait, and the system will boot to the desktop automatically. The installation screen is shown in Figure 6.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="549" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-7.png" alt="" class="wp-image-1799" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-7.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-7-300x190.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-7-768x487.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 6: Installation screen</figcaption></figure>



<p>6. When installation finishes, the account you just created is displayed. Click GZSXY to log in, as shown in Figures 7 and 8.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="563" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-8.png" alt="" class="wp-image-1800" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-8.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-8-300x195.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-8-768x500.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 7: Login screen</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="568" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-9.png" alt="" class="wp-image-1801" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-9.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-9-300x197.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-9-768x504.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 8: Password entry screen</figcaption></figure>



<p>7. Set the password for the root account, as shown in Figure 9.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="670" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-10.png" alt="" class="wp-image-1802" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-10.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-10-300x232.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-10-768x595.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 9: Setting the root password to root</figcaption></figure>



<p>8. Set the system language to Chinese: run dpkg-reconfigure locales in a terminal, use Tab to move to the locale list, find en_US.UTF-8, and press 【Space】 to clear its * mark, as shown in Figure 10.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="774" height="706" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-11.png" alt="" class="wp-image-1803" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-11.png 774w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-11-300x274.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-11-768x701.png 768w" sizes="auto, (max-width: 774px) 100vw, 774px" /><figcaption class="wp-element-caption">Figure 10</figcaption></figure>



<p>9. Find zh_CN.GBK and zh_CN.UTF-8 and press 【Space】 to select them, as shown in Figure 11.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="499" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-12.png" alt="" class="wp-image-1804" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-12.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-12-300x173.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-12-768x443.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 11: Adding the Chinese locale options</figcaption></figure>



<p>10. When done, use Tab to move to &lt;ok&gt; and press 【Enter】 to go to the next screen. Use the ↑ and ↓ keys to find 【zh_CN.UTF-8】 and press 【Enter】 to select it as the default, as shown in Figure 12.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="478" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-13.png" alt="" class="wp-image-1805" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-13.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-13-300x166.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-13-768x424.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 12: Configuring the Chinese interface</figcaption></figure>
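


<p>For readers who prefer to skip the interactive dialog, the same locale setup can probably be scripted. This is a sketch of an alternative, not the method used in this tutorial. It assumes the standard Ubuntu tools locale-gen and update-locale are present and that the commands are run as root.</p>



<pre class="wp-block-code"><code># Generate the Simplified Chinese UTF-8 locale (run as root).
locale-gen zh_CN.UTF-8

# Make it the system default; this takes effect at the next login or reboot.
update-locale LANG=zh_CN.UTF-8

# List the locales that are now available to confirm the new entry exists.
locale -a</code></pre>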



<p>11. Reboot the virtual machine so that the configuration takes effect.</p>



<p>12. After the reboot, choose not to update (【不更新】) in the software-source update prompt. To keep later configuration simple, choose 【keep old names】 in the folder-naming prompt, as shown in Figure 13.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="510" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-14.png" alt="" class="wp-image-1806" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-14.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-14-300x177.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-14-768x453.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 13: Configuration prompts after the reboot</figcaption></figure>



<p>13. First run 【apt update】 to refresh the package index, then run 【apt install vim】 to install the vim editor. The default source is used at this point, as shown in Figure 14.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="479" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-15.png" alt="" class="wp-image-1807" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-15.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-15-300x166.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-15-768x425.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 14: Updating the default source and installing the vim editor</figcaption></figure>



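<p>14. After installing vim, switch the package source: run 【vim /etc/apt/sources.list】 to edit the source list, comment out the existing entries with 【#】 or delete them all, then paste the following Aliyun mirror entries into the file.</p>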



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<p>deb http://mirrors.aliyun.com/ubuntu/ xenial main restricted universe multiverse&nbsp;</p>



<p>deb http://mirrors.aliyun.com/ubuntu/ xenial-security main restricted universe multiverse&nbsp;</p>



<p>deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted universe multiverse&nbsp;</p>



<p>deb http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse&nbsp;</p>



<p>15. After step 14, the source list should look like Figure 15.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="306" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-16.png" alt="" class="wp-image-1808" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-16.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-16-300x106.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-16-768x272.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 15: Switching UbuntuKylin to the Aliyun source</figcaption></figure>
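


<p>The source switch can also be prepared without opening an editor. The sketch below writes the four Aliyun entries listed above to a scratch file for review. The scratch file name sources.list.aliyun and the backup name sources.list.bak are illustrative choices, and the final copy commands are left as comments because they must be run as root.</p>



<pre class="wp-block-code"><code># Write the four Aliyun entries for Ubuntu 16.04 (xenial) to a scratch file.
cat &gt; sources.list.aliyun &lt;&lt;'EOF'
deb http://mirrors.aliyun.com/ubuntu/ xenial main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse
EOF

# Count the deb lines in the scratch file as a quick sanity check.
grep -c '^deb ' sources.list.aliyun

# When satisfied, back up the original and install the new list as root:
#   cp /etc/apt/sources.list /etc/apt/sources.list.bak
#   cp sources.list.aliyun /etc/apt/sources.list</code></pre>



<p>Keeping a backup of the original list makes it easy to revert if the mirror turns out to be unreachable.</p>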



<p>16. Run 【apt update】 to refresh the package index from the new source, then run 【apt install ssh】 to install the SSH service, as shown in Figure 16.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="306" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-17.png" alt="" class="wp-image-1809" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-17.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-17-300x106.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-17-768x272.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 16: Installing the SSH service</figcaption></figure>



<p>17. Add the hadoop user: run 【useradd -m hadoop -s /bin/bash】 to create the user, 【passwd hadoop】 to set its password, and 【adduser hadoop sudo】 to add it to the administrator group. The commands are shown in Figure 17.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="758" height="219" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-18.png" alt="" class="wp-image-1810" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-18.png 758w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-18-300x87.png 300w" sizes="auto, (max-width: 758px) 100vw, 758px" /><figcaption class="wp-element-caption">Figure 17: Adding and configuring the hadoop user</figcaption></figure>
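


<p>For convenience, the three commands from step 17 are collected below with a comment on what each one does. The final id check is an addition for verification and is not part of the original steps. The commands assume a root shell.</p>



<pre class="wp-block-code"><code># Run as root (or prefix each command with sudo).
useradd -m hadoop -s /bin/bash   # create the user with a home directory and bash as the login shell
passwd hadoop                    # set the password interactively
adduser hadoop sudo              # add the user to the sudo (administrator) group

# Added check: the group list printed here should include sudo.
id hadoop</code></pre>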



<p>18. Switch to the hadoop user with 【su hadoop】 or by switching users in the desktop session (recommended). Then run 【ssh localhost】 to start SSH for the first time, and set up passwordless login with 【cd ~/.ssh/】, 【ssh-keygen -t rsa】, and 【cat ./id_rsa.pub &gt;&gt; ./authorized_keys】, as shown in Figure 18.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="577" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-19.png" alt="" class="wp-image-1811" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-19.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-19-300x200.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-19-768x512.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 18: Configuring SSH passwordless login</figcaption></figure>
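


<p>The key setup in step 18 can be read as the following sequence, run as the hadoop user after the first 【ssh localhost】 session has been closed with exit. The chmod line and the final login test are additions and rest on the assumption that sshd uses its default strict permission checks. They are not shown in the original steps.</p>



<pre class="wp-block-code"><code>cd ~/.ssh/                               # created by the first "ssh localhost" login
ssh-keygen -t rsa                        # press Enter at every prompt to accept the defaults
cat ./id_rsa.pub &gt;&gt; ./authorized_keys    # authorize the new public key for this account

# Assumption: restrict the file to its owner so that sshd accepts it.
chmod 600 ./authorized_keys

# Added check: this login should no longer ask for a password.
ssh localhost</code></pre>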



<p>19. Run 【ifconfig】 to find the virtual machine's IP address, then connect to it with Xshell. The connection settings are shown in Figure 19.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="375" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-20.png" alt="" class="wp-image-1812" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-20.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-20-300x130.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-20-768x333.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 19: Connecting to Ubuntu with Xshell</figcaption></figure>



<p>20. After you click the 【连接】 (Connect) button in Figure 19, Xshell prompts for a user name and password. Enter the hadoop user and the password you created, as shown in Figures 20 and 21.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="789" height="406" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-21.png" alt="" class="wp-image-1813" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-21.png 789w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-21-300x154.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-21-768x395.png 768w" sizes="auto, (max-width: 789px) 100vw, 789px" /><figcaption class="wp-element-caption">Figure 20: Entering the login user name</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="789" height="383" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-22.png" alt="" class="wp-image-1814" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-22.png 789w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-22-300x146.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-22-768x373.png 768w" sizes="auto, (max-width: 789px) 100vw, 789px" /><figcaption class="wp-element-caption">Figure 21: Entering the login password</figcaption></figure>



<p>21. If everything was entered correctly, Xshell connects to Ubuntu and the terminal shows the hadoop user, as shown in Figure 22.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="445" height="556" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-23.png" alt="" class="wp-image-1815" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-23.png 445w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-23-240x300.png 240w" sizes="auto, (max-width: 445px) 100vw, 445px" /><figcaption class="wp-element-caption">Figure 22: Successful SSH connection through Xshell</figcaption></figure>



<p>22. Copy the 【tar.gz】 archives shown in Figure 1 into the hadoop user's home directory, then run the following commands:</p>



<ol class="wp-block-list">
<li>cd ~</li>



<li>tar -zxf hadoop-3.1.3.tar.gz -C /usr/local</li>



<li>cd /usr/local/</li>



<li>mv ./hadoop-3.1.3/ ./hadoop</li>



<li>chown -R hadoop ./hadoop</li>
</ol>
</div></div>



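<p>These commands extract and configure Hadoop, as shown in Figure 23. Prefix them with sudo if you are not running as root, because /usr/local is not writable by ordinary users.</p>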



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="403" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-24.png" alt="" class="wp-image-1816" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-24.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-24-300x140.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-24-768x358.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 23: Extracting and configuring Hadoop</figcaption></figure>



<p>23. Change to 【/usr/lib】, create a folder to hold the JDK, return to the home directory, and extract the JDK archive into the new folder. The commands are listed below, and the process is shown in Figure 24.</p>



<ol class="wp-block-list">
<li>cd /usr/lib</li>



<li>mkdir jvm </li>



<li>cd <em>/home/hadoop</em></li>



<li>tar -zxvf ./jdk-8u162-linux-x64.tar.gz -C /usr/lib/jvm</li>
</ol>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="360" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-26.png" alt="" class="wp-image-1818" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-26.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-26-300x125.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-26-768x320.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 24: Creating the JDK directory and extracting the JDK</figcaption></figure>



<p>24. Run 【vim ./.bashrc】 to edit the .bashrc file and add the following lines on blank lines at the top of the file. The result is shown in Figure 25.</p>



<pre class="wp-block-code"><code>export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_162
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native</code></pre>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-25.png" alt="" class="wp-image-1817" width="734" height="556" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-25.png 763w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-25-300x228.png 300w" sizes="auto, (max-width: 734px) 100vw, 734px" /><figcaption class="wp-element-caption">Figure 25: Configuring the environment variables</figcaption></figure>



<p>25. Run source ~/.bashrc to reload the file in the current shell, then run 【java -version】 to check the Java version. If the previous steps were carried out correctly, the Java version number is displayed, as shown in Figure 26.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="750" height="110" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-27.png" alt="" class="wp-image-1819" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-27.png 750w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-27-300x44.png 300w" sizes="auto, (max-width: 750px) 100vw, 750px" /><figcaption class="wp-element-caption">Figure 26: The configured Java environment</figcaption></figure>



<p>26. Change to the Hadoop directory and run 【./bin/hadoop version】 to check the Hadoop version. If the previous steps were configured correctly, the Hadoop version number is displayed, as shown in Figure 27.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="127" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-28.png" alt="" class="wp-image-1820" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-28.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-28-300x44.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-28-768x113.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 27: Hadoop version number</figcaption></figure>



<p>27. Run the grep example: it takes all files in the input folder as input, finds the words that match the regular expression dfs[a-z.]+, counts how often each occurs, and writes the result to the output folder, as shown in Figure 28.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="71" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-29.png" alt="" class="wp-image-1821" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-29.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-29-300x25.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-29-768x63.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 28: Test example</figcaption></figure>
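


<p>Since the command in Figure 28 is only available as a screenshot, here is a sketch of the grep example as it appears in the XMU tutorial listed in the references. It assumes Hadoop is installed in /usr/local/hadoop and that Hadoop's own XML configuration files are used as the input. The exact command in the figure may differ.</p>



<pre class="wp-block-code"><code>cd /usr/local/hadoop

# Use Hadoop's own configuration files as sample input.
mkdir -p ./input
cp ./etc/hadoop/*.xml ./input

# Run the bundled grep example; Hadoop refuses to start if ./output already exists,
# so remove an old ./output directory first when re-running.
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar grep ./input ./output 'dfs[a-z.]+'

# Print the matched words and their counts.
cat ./output/*</code></pre>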



<p>28. After a successful run, the output shows that dfsadmin occurred once, as shown in Figure 29.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="589" height="93" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-30.png" alt="" class="wp-image-1822" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-30.png 589w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-30-300x47.png 300w" sizes="auto, (max-width: 589px) 100vw, 589px" /><figcaption class="wp-element-caption">Figure 29: Result displayed after the test example finishes</figcaption></figure>



<p>29. Next, configure Hadoop for pseudo-distributed mode by editing its XML configuration files.</p>



<p>30. Hadoop's configuration files are located in /usr/local/hadoop/etc/hadoop/. Pseudo-distributed mode requires changing two of them: <strong>core-site.xml</strong> and <strong>hdfs-site.xml</strong>. The files are in XML format, and each setting is declared as a property with a name and a value. Run 【vim ./etc/hadoop/core-site.xml】 to edit <strong>core-site.xml</strong>. The result is shown in Figure 30.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="242" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-31.png" alt="" class="wp-image-1823" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-31.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-31-300x84.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-31-768x215.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 30: The modified core-site.xml file</figcaption></figure>
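


<p>Because Figure 30 is an image, the sketch below reproduces a typical pseudo-distributed core-site.xml in copyable form. The property values are taken from the XMU tutorial listed in the references and are assumed, not confirmed, to match the figure. The scratch file name core-site.xml.sketch is an illustrative choice.</p>



<pre class="wp-block-code"><code># Write the sketch to a scratch file so it can be compared with Figure 30 first.
cat &gt; core-site.xml.sketch &lt;&lt;'EOF'
&lt;configuration&gt;
    &lt;property&gt;
        &lt;name&gt;hadoop.tmp.dir&lt;/name&gt;
        &lt;value&gt;file:/usr/local/hadoop/tmp&lt;/value&gt;
    &lt;/property&gt;
    &lt;property&gt;
        &lt;name&gt;fs.defaultFS&lt;/name&gt;
        &lt;value&gt;hdfs://localhost:9000&lt;/value&gt;
    &lt;/property&gt;
&lt;/configuration&gt;
EOF

# List the property names that were written.
grep '&lt;name&gt;' core-site.xml.sketch

# After reviewing, copy it into place:
#   cp core-site.xml.sketch /usr/local/hadoop/etc/hadoop/core-site.xml</code></pre>



<p>Here hadoop.tmp.dir moves Hadoop's working data out of the system temporary directory, and fs.defaultFS is the address that HDFS clients connect to.</p>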



<p>31. Run 【vim ./etc/hadoop/hdfs-site.xml】 to edit <strong>hdfs-site.xml</strong>. The result is shown in Figure 31.</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-32.png" alt="" class="wp-image-1824" width="734" height="355" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-32.png 763w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-32-300x145.png 300w" sizes="auto, (max-width: 734px) 100vw, 734px" /><figcaption class="wp-element-caption">Figure 31: The modified hdfs-site.xml file</figcaption></figure>
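


<p>Similarly, a copyable sketch of a pseudo-distributed hdfs-site.xml follows. The three properties and their values come from the XMU tutorial listed in the references and are assumed to correspond to Figure 31. The scratch file name hdfs-site.xml.sketch is an illustrative choice.</p>



<pre class="wp-block-code"><code># Write the sketch to a scratch file so it can be compared with Figure 31 first.
cat &gt; hdfs-site.xml.sketch &lt;&lt;'EOF'
&lt;configuration&gt;
    &lt;property&gt;
        &lt;name&gt;dfs.replication&lt;/name&gt;
        &lt;value&gt;1&lt;/value&gt;
    &lt;/property&gt;
    &lt;property&gt;
        &lt;name&gt;dfs.namenode.name.dir&lt;/name&gt;
        &lt;value&gt;file:/usr/local/hadoop/tmp/dfs/name&lt;/value&gt;
    &lt;/property&gt;
    &lt;property&gt;
        &lt;name&gt;dfs.datanode.data.dir&lt;/name&gt;
        &lt;value&gt;file:/usr/local/hadoop/tmp/dfs/data&lt;/value&gt;
    &lt;/property&gt;
&lt;/configuration&gt;
EOF

# List the property names that were written.
grep '&lt;name&gt;' hdfs-site.xml.sketch

# After reviewing, copy it into place:
#   cp hdfs-site.xml.sketch /usr/local/hadoop/etc/hadoop/hdfs-site.xml</code></pre>



<p>A replication factor of 1 fits a single-node setup, since there is only one DataNode to hold each block.</p>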



<p>32. Format the NameNode (typically with 【./bin/hdfs namenode -format】), as shown in Figure 32.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="318" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-33.png" alt="" class="wp-image-1825" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-33.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-33-300x110.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-33-768x282.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 32: Formatting the NameNode from the command line</figcaption></figure>



<p>33. Run 【./sbin/start-dfs.sh】 to start the NameNode and DataNode daemons; the result is shown in Figure 33.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="59" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-34.png" alt="" class="wp-image-1826" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-34.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-34-300x20.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-34-768x52.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 33: Starting the NameNode and DataNode daemons</figcaption></figure>



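<p>34. Open 【http://VM_ADDRESS:9870】 in a browser (replace VM_ADDRESS with the virtual machine's IP address) to view NameNode and DataNode information and to browse the files stored in HDFS, as shown in Figure 34.</p>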



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="482" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-35.png" alt="" class="wp-image-1827" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-35.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-35-300x167.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-35-768x428.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 34: Viewing NameNode and DataNode information in a browser</figcaption></figure>



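<p>35. To use HDFS, first create the user directory in HDFS with 【./bin/hdfs dfs -mkdir -p /user/hadoop】. Next, copy the XML files in ./etc/hadoop (that is, /usr/local/hadoop/etc/hadoop) into /user/hadoop/input in the distributed file system to serve as input. Because we are working as the hadoop user and its user directory /user/hadoop already exists, commands can use a relative path such as input, which resolves to the absolute path /user/hadoop/input: 【./bin/hdfs dfs -mkdir input】 【./bin/hdfs dfs -put ./etc/hadoop/*.xml input】. Once the copy finishes, list the files with 【./bin/hdfs dfs -ls input】. The full sequence is shown in Figure 35.</p>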



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="322" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-36.png" alt="" class="wp-image-1828" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-36.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-36-300x112.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-36-768x286.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 35: Configuring HDFS</figcaption></figure>
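<p>The HDFS preparation in step 35 can be collected into one small script. This is a sketch under assumptions: the run helper, the plan variable, and the DRY_RUN switch are illustrative names and not part of Hadoop, and the copy step is written with hdfs dfs -put. By default the script only prints the commands; set DRY_RUN=0 and run it from /usr/local/hadoop to execute them.</p>

```shell
#!/bin/sh
# Sketch: the HDFS preparation steps as one script.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
# ASSUMPTION: "run", "plan" and DRY_RUN are illustrative helpers, not Hadoop features.
DRY_RUN="${DRY_RUN:-1}"
plan=""   # records every command line, one per line

run() {
  plan="$plan$*
"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"      # show the command only
  else
    "$@"             # execute it (requires a running HDFS)
  fi
}

run ./bin/hdfs dfs -mkdir -p /user/hadoop          # user directory in HDFS
run ./bin/hdfs dfs -mkdir input                    # relative path: /user/hadoop/input
run ./bin/hdfs dfs -put ./etc/hadoop/*.xml input   # upload the XML files as job input
run ./bin/hdfs dfs -ls input                       # confirm the upload
```

<p>Printing first makes it easy to compare the sequence against Figure 35 before anything is written to HDFS.</p>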



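<p>36. Run the example job again, this time in pseudo-distributed mode, to verify the setup. The command is given below and the result is shown in Figure 36.</p>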



<p>./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar grep input output 'dfs[a-z.]+'</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="87" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-37.png" alt="" class="wp-image-1829" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-37.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-37-300x30.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-37-768x77.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 36: Checking the job output again</figcaption></figure>
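<p>To get a feel for what the pattern 'dfs[a-z.]+' selects, the sketch below applies it to a few made-up lines with the local grep tool. It only approximates the Hadoop grep example (which runs as a MapReduce job over the files in input); the sample text is invented for illustration, and the -o option assumes GNU or BSD grep.</p>

```shell
#!/bin/sh
# Sketch: preview which strings the pattern 'dfs[a-z.]+' matches, using local grep.
# ASSUMPTION: the sample text is invented; the real job reads the XML files in HDFS.
sample='dfs.replication is the key; dfs.replication defaults to 3
storage: dfs.namenode.name.dir and dfs.datanode.data.dir
fs.defaultFS does not match'

# -o prints each match on its own line; sort | uniq -c counts occurrences,
# similar in spirit to the per-match counts the example job reports.
matches="$(printf '%s\n' "$sample" | grep -oE 'dfs[a-z.]+' | sort | uniq -c)"
printf '%s\n' "$matches"
```

<p>The pattern requires the literal prefix dfs followed by at least one lowercase letter or dot, which is why fs.defaultFS is not counted.</p>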



<p>37. Copy the results back to the local ./output directory and display them with cat; the steps and output are shown in Figure 37.</p>



<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-38.png" alt="" class="wp-image-1830" width="743" height="81" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-38.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-38-300x33.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-38-768x84.png 768w" sizes="auto, (max-width: 743px) 100vw, 743px" /><figcaption class="wp-element-caption">Figure 37: Copying the results back to a local directory</figcaption></figure>



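<p>38. Use 【./bin/hdfs dfs -rm -r output】 to delete the output directory (Hadoop will not overwrite an existing output directory, so remove it before rerunning the job), 【./sbin/stop-dfs.sh】 to stop Hadoop, and 【./sbin/start-dfs.sh】 to start it again. All three commands must be run from the Hadoop installation directory, as shown in Figure 38.</p>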



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="119" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-39.png" alt="" class="wp-image-1831" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-39.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-39-300x41.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-39-768x106.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 38: Stopping and starting Hadoop</figcaption></figure>



<p>39. The remaining steps configure Eclipse. They are not difficult, so they are covered together here.</p>



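<p>40. Extract the Eclipse archive, then copy the JRE from the previously created 【/usr/lib/jvm/jdk1.8.0_162/jre/bin】 directory into Eclipse's jre directory. Double-click the Eclipse gear icon and it should start normally, as shown in Figure 39.</p>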



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="865" height="442" src="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-40.png" alt="" class="wp-image-1832" srcset="https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-40.png 865w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-40-300x153.png 300w, https://www.leexinghai.com/aic/wp-content/uploads/2023/02/image-40-768x392.png 768w" sizes="auto, (max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption">Figure 39: Configuring and launching Eclipse</figcaption></figure>



</div></div>



<p>Technical support and guidance are available for environment setup:</p>



<p>For class 【计科2005】 (Computer Science 2005): free, full technical support</p>



<p>For 【other】 classes: free limited technical support, or paid full technical support</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.leexinghai.com/aic/h30222-%e3%80%8a%e5%a4%a7%e6%95%b0%e6%8d%ae%e5%ba%94%e7%94%a8%e6%8a%80%e6%9c%af%e3%80%8b%e8%af%be%e7%a8%8b%e7%ae%80%e6%98%93%e7%8e%af%e5%a2%83%e9%85%8d%e7%bd%aevmware%e6%96%b9%e6%b3%95/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
	</channel>
</rss>
