We will upload the MovieLens dataset to Hadoop's HDFS.
In a browser, open localhost:4200 and log in as maria_dev/maria_dev.
[maria_dev@sandbox-hdp ~]$ hadoop fs -ls
Found 2 items
drwxr-xr-x - maria_dev hdfs 0 2022-05-28 04:03 .Trash
drwxr-xr-x - maria_dev hdfs 0 2022-05-19 11:05 hive
[maria_dev@sandbox-hdp ~]$ wget media.sundog-soft.com/hadoop/ml-100k/u.data
--2022-05-28 06:17:19--  http://media.sundog-soft.com/hadoop/ml-100k/u.data
Resolving media.sundog-soft.com (media.sundog-soft.com)... 52.216.229.0
Connecting to media.sundog-soft.com (media.sundog-soft.com)|52.216.229.0|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2079229 (2.0M) [application/octet-stream]
Saving to: ‘u.data’
100%[=================================================================================================>] 2,079,229 1.47MB/s in 1.3s
2022-05-28 06:17:21 (1.47 MB/s) - ‘u.data’ saved [2079229/2079229]
[maria_dev@sandbox-hdp ~]$ hadoop fs -mkdir ml-100k
[maria_dev@sandbox-hdp ~]$ hadoop fs -ls
Found 3 items
drwxr-xr-x - maria_dev hdfs 0 2022-05-28 04:03 .Trash
drwxr-xr-x - maria_dev hdfs 0 2022-05-19 11:05 hive
drwxr-xr-x - maria_dev hdfs 0 2022-05-28 06:14 ml-100k
[maria_dev@sandbox-hdp ~]$ hadoop fs -copyFromLocal ./u.data ml-100k/u.data
[maria_dev@sandbox-hdp ~]$ hadoop fs -ls ./ml-100k
Found 1 items
-rw-r--r-- 1 maria_dev hdfs 2079229 2022-05-28 06:18 ml-100k/u.data
[maria_dev@sandbox-hdp ~]$ hadoop fs -rm ml-100k/u.data
22/05/28 06:19:31 INFO fs.TrashPolicyDefault: Moved: 'hdfs://sandbox-hdp.hortonworks.com:8020/user/maria_dev/ml-100k/u.data' to trash at: hdfs://sandbox-hdp.hortonworks.com:8020/user/maria_dev/.Trash/Current/user/maria_dev/ml-100k/u.data
[maria_dev@sandbox-hdp ~]$ hadoop fs -ls ml-100k/
[maria_dev@sandbox-hdp ~]$ hadoop fs -ls
Found 3 items
drwxr-xr-x - maria_dev hdfs 0 2022-05-28 06:19 .Trash
drwxr-xr-x - maria_dev hdfs 0 2022-05-19 11:05 hive
drwxr-xr-x - maria_dev hdfs 0 2022-05-28 06:19 ml-100k
[maria_dev@sandbox-hdp ~]$ hadoop fs -rmdir ml-100k
[maria_dev@sandbox-hdp ~]$ hadoop fs -ls
Found 2 items
drwxr-xr-x - maria_dev hdfs 0 2022-05-28 06:19 .Trash
drwxr-xr-x - maria_dev hdfs 0 2022-05-19 11:05 hive
For jobs that run periodically or need to be scripted, the CLI is the better choice.
Then again, the CLI is simply more convenient to begin with.
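As a rough illustration of the scripting point, the upload steps from the session above could be wrapped in a small shell function. This is only a sketch: the function name upload_to_hdfs is hypothetical (it is not part of Hadoop), and the -p and -f flags are added on the assumption that the script should be safe to run repeatedly, which the interactive session did not need.

```shell
#!/usr/bin/env bash
# Sketch: upload one local file into an HDFS directory.
# upload_to_hdfs is a hypothetical helper name, not a Hadoop command.

upload_to_hdfs() {
    local src="$1"
    local dest_dir="$2"

    # Fail early if the local file is missing.
    if [ ! -f "$src" ]; then
        echo "missing local file: $src" >&2
        return 1
    fi

    # -p: do not fail if the directory already exists.
    hadoop fs -mkdir -p "$dest_dir" || return 1

    # -f: overwrite the destination file if it already exists.
    hadoop fs -copyFromLocal -f "$src" "$dest_dir/$(basename "$src")" || return 1

    # List the directory so the result shows up in the job's log.
    hadoop fs -ls "$dest_dir"
}

# Example call, left commented out so sourcing this file has no side effects:
# upload_to_hdfs ./u.data ml-100k
```

Because the function returns a non-zero status on failure, a scheduler such as cron can detect a failed upload from the exit code, which is hard to get from a browser-based file view.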