File not found error at step 2 in yarn logs


Gavin_Chou
Hi all,
        I have a problem while building a cube: it fails at step 2.

        The following error appears in the YARN log:

2017-06-14 11:21:08,793 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1497364689294_0018 transitioned from NEW to INITING
2017-06-14 11:21:08,793 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1497364689294_0018_01_000001 to application application_1497364689294_0018
2017-06-14 11:21:08,793 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1497364689294_0018 transitioned from INITING to RUNNING
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1497364689294_0018_01_000001 transitioned from NEW to LOCALIZING
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1497364689294_0018
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/tmp/hadoop-yarn/staging/hadoop/.staging/job_1497364689294_0018/job.jar transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/tmp/hadoop-yarn/staging/hadoop/.staging/job_1497364689294_0018/job.splitmetainfo transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/tmp/hadoop-yarn/staging/hadoop/.staging/job_1497364689294_0018/job.split transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/tmp/hadoop-yarn/staging/hadoop/.staging/job_1497364689294_0018/job.xml transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1497364689294_0018_01_000001
2017-06-14 11:21:08,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta, 1497410467000, FILE, null }
2017-06-14 11:21:08,796 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /home/q/hadoop/hadoop/tmp/nm-local-dir/nmPrivate/container_1497364689294_0018_01_000001.tokens. Credentials list:
2017-06-14 11:21:08,796 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc { { file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta, 1497410467000, FILE, null },pending,[(container_1497364689294_0018_01_000001)],781495827608056,DOWNLOADING}
java.io.FileNotFoundException: File file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:524)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:737)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:514)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:250)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:353)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:59)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
2017-06-14 11:21:08,796 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user hadoop
2017-06-14 11:21:08,797 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta(->/home/q/hadoop/hadoop/tmp/nm-local-dir/filecache/18/meta) transitioned from DOWNLOADING to FAILED
2017-06-14 11:21:08,797 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1497364689294_0018_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2017-06-14 11:21:08,797 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1497364689294_0018_01_000001 sent RELEASE event on a resource request { file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta, 1497410467000, FILE, null } not present in cache.
2017-06-14 11:21:08,797 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: LOCALIZATION_FAILED APPID=application_1497364689294_0018 CONTAINERID=container_1497364689294_0018_01_000001
2017-06-14 11:21:08,797 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1497364689294_0018_01_000001 transitioned from LOCALIZATION_FAILED to DONE

        This error appears in the yarn-nodemanager logs of machines B and D. Shortly before it, I found the following in the yarn-nodemanager log of machine C (Kylin is installed only on machine A):

2017-06-14 11:21:01,131 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1497364689294_0017_01_000002 transitioned from LOCALIZING to LOCALIZED
2017-06-14 11:21:01,146 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1497364689294_0017_01_000002 transitioned from LOCALIZED to RUNNING
2017-06-14 11:21:01,146 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory monitoring is needed. Not running the monitor-thread
2017-06-14 11:21:01,149 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [nice, -n, 0, bash, /home/q/hadoop/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1497364689294_0017/container_1497364689294_0017_01_000002/default_container_executor.sh]
2017-06-14 11:21:05,024 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1497364689294_0017_01_000002
2017-06-14 11:21:05,025 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop IP=10.90.181.160 OPERATION=Stop Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1497364689294_0017 CONTAINERID=container_1497364689294_0017_01_000002
2017-06-14 11:21:05,025 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1497364689294_0017_01_000002 transitioned from RUNNING to KILLING
2017-06-14 11:21:05,025 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1497364689294_0017_01_000002
2017-06-14 11:21:05,028 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1497364689294_0017_01_000002 is : 143
2017-06-14 11:21:05,040 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1497364689294_0017_01_000002 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
2017-06-14 11:21:05,041 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1497364689294_0017 CONTAINERID=container_1497364689294_0017_01_000002
2017-06-14 11:21:05,041 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1497364689294_0017_01_000002 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE

        What puzzles me is why, at step 2, applications on other nodes try to load a file that is local to the Kylin machine. How can I solve this?
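From the log above, the failing resource is registered with a `file:/` URI (`file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta.../meta`), so every NodeManager tries to localize it from its own local disk, which can only succeed on the machine where Kylin wrote it. One possible cause (an assumption to check, not a confirmed diagnosis for this cluster) is that the Hadoop configuration visible to Kylin resolves the default filesystem to the local FS, so the job metadata is never uploaded to HDFS before localization. A sketch of what to check in `core-site.xml` on the Kylin node; `mycluster` is a placeholder nameservice name:

```xml
<!-- core-site.xml as seen by Kylin on machine A.
     Hypothetical fix sketch: if fs.defaultFS points at the local
     filesystem (file:///, the Hadoop default), job resources get
     registered with file:/ URIs that NodeManagers on B, C, and D
     cannot localize. -->
<property>
  <name>fs.defaultFS</name>
  <!-- "mycluster" is a placeholder for your HA nameservice name -->
  <value>hdfs://mycluster</value>
</property>
```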

        Here is some additional information (it may be helpful for analyzing the problem):
                The cluster has 4 machines: A, B, C, and D.
                Hadoop version 2.5.0, built with snappy support
                              NameNode: A (standby), B (active)
                              DataNode: all machines
                Hive version 0.13.1, recompiled for Hadoop 2
                HBase version 0.98.6, recompiled for Hadoop 2.5.0
                             Master: A (active) and B
                When I set "hbase.rootdir" in hbase-site.xml to the exact IP address of the active NameNode, step 2 succeeds, but the build then fails at the last steps.
                So I changed the setting to the cluster (nameservice) name, and now there are no errors in the HBase logs.
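For reference, pointing `hbase.rootdir` at the HA nameservice (the logical cluster name) rather than at a single NameNode address is the usual recommendation, since a hard-coded IP breaks whenever the active NameNode fails over. A sketch, with `mycluster` again as a placeholder nameservice name:

```xml
<!-- hbase-site.xml: use the logical nameservice so HBase keeps working
     after a NameNode failover; "mycluster" is a placeholder name and
     must match dfs.nameservices in hdfs-site.xml. -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://mycluster/hbase</value>
</property>
```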

Thank you

Best regards  



