Handling the "Invalid sync" and "无法分配内存" (Cannot allocate memory) errors in Hive jobs - levy

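Colleagues in the analytics department recently reported that Hive jobs processing the raw data kept failing before completion with the error below. The error means that some of the Avro files are incomplete: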
Diagnostic Messages for this Task:
Error: java.io.IOException: java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:273)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:183)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:115)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:271)
        ... 11 more
Caused by: org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
        at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:210)
        at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.next(AvroGenericRecordReader.java:149)
        at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.next(AvroGenericRecordReader.java:52)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347)
        ... 15 more
Caused by: java.io.IOException: Invalid sync!
        at org.apache.avro.file.DataFileStream.nextRawBlock(DataFileStream.java:293)
        at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:198)
        ... 18 more
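
To find out which files are affected, the framing of an Avro container file can be checked without decoding any records. The sketch below follows the Avro object container layout (magic bytes, a metadata map, a 16-byte sync marker, then blocks that each end with the same marker) and reports the first block whose marker is missing or wrong. The names `check_avro_container` and `build_sample` are made up for this illustration, and the check only covers framing: it does not decompress or decode the records themselves.

```python
import io

MAGIC = b"Obj\x01"   # 4-byte magic at the start of every Avro container file
SYNC_SIZE = 16       # length of the sync marker ending the header and each block


def write_long(value):
    """Encode an int as an Avro long (zigzag, then base-128 varint)."""
    n = (value << 1) ^ (value >> 63)
    out = bytearray()
    while n & ~0x7F:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("unexpected end of file")
    return data


def read_long(stream):
    """Decode one Avro long (base-128 varint, then zigzag)."""
    shift = 0
    accum = 0
    while True:
        byte = read_exact(stream, 1)[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def skip_metadata(stream):
    """Skip the header metadata map (string keys, bytes values)."""
    while True:
        count = read_long(stream)
        if count == 0:
            return
        if count < 0:              # negative count is followed by a byte size
            count = -count
            read_long(stream)
        for _ in range(count):
            read_exact(stream, read_long(stream))   # key
            read_exact(stream, read_long(stream))   # value


def check_avro_container(stream):
    """Return (ok, message) for a seekable binary stream."""
    try:
        if read_exact(stream, len(MAGIC)) != MAGIC:
            return False, "not an Avro container file"
        skip_metadata(stream)
        sync = read_exact(stream, SYNC_SIZE)
        blocks = 0
        while stream.read(1):                  # empty read: clean end of file
            stream.seek(-1, io.SEEK_CUR)       # put the peeked byte back
            read_long(stream)                  # record count of this block
            size = read_long(stream)           # byte size of the block payload
            if size < 0:
                return False, f"negative block size in block {blocks}"
            stream.seek(size, io.SEEK_CUR)     # skip the payload
            if read_exact(stream, SYNC_SIZE) != sync:
                return False, f"Invalid sync after block {blocks}"
            blocks += 1
    except EOFError:
        return False, "truncated file"
    return True, f"{blocks} block(s) ok"


def build_sample():
    """Build a tiny one-block file (schema "string", one record "ab") for testing."""
    def avro_bytes(b):
        return write_long(len(b)) + b
    sync = bytes(range(SYNC_SIZE))
    header = (MAGIC + write_long(1) + avro_bytes(b"avro.schema")
              + avro_bytes(b'"string"') + write_long(0) + sync)
    record = avro_bytes(b"ab")
    return header + write_long(1) + write_long(len(record)) + record + sync
```

For a real file, copy it from HDFS to local disk first (for example with `hdfs dfs -get`) and pass `open(path, "rb")` to `check_avro_container`.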

The logs of the recently failed jobs showed that the server had run out of memory:
Log Type: syslog
Log Length: 18946

2015-12-27 13:30:44,516 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-12-27 13:30:44,540 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: Sink ganglia started
2015-12-27 13:30:44,601 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-12-27 13:30:44,601 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2015-12-27 13:30:44,609 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2015-12-27 13:30:44,609 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1451036614992_0057, Ident: ([email protected]b3f4c)
2015-12-27 13:30:44,670 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2015-12-27 13:30:44,907 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /diskb/hadoop/yarn/local/usercache/hdfs/appcache/application_1451036614992_0057,/diskc/hadoop/yarn/local/usercache/hdfs/appcache/application_1451036614992_0057
2015-12-27 13:30:45,345 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2015-12-27 13:30:45,669 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
2015-12-27 13:30:46,003 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://BeiJing/data/raw/click/2015122710/http-topic.avro.192.168.2.12.avro:1342177280+47143758
2015-12-27 13:30:46,223 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer

Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000007efc80000, 272105472, 0) failed; error='无法分配内存' (errno=12)
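
The warning above carries two useful numbers: the size of the allocation the JVM attempted and the errno returned by the operating system. A minimal sketch for extracting them from such a log line follows. The helper name `parse_commit_failure` and the regular expression are assumptions written for this one message format, not part of any Hadoop or JVM tooling.

```python
import errno
import re

# Matches the HotSpot warning format seen in the log above.
COMMIT_FAILURE = re.compile(
    r"os::commit_memory\((0x[0-9a-fA-F]+), (\d+), \d+\) failed; "
    r"error='(.*)' \(errno=(\d+)\)"
)


def parse_commit_failure(line):
    """Return the requested size and errno from a commit_memory warning, or None."""
    match = COMMIT_FAILURE.search(line)
    if match is None:
        return None
    size = int(match.group(2))
    code = int(match.group(4))
    return {
        "bytes": size,
        "mib": size / 2**20,
        "errno": code,
        # Symbolic name of the errno on the machine running this script.
        "name": errno.errorcode.get(code, "unknown"),
    }


line = ("Java HotSpot(TM) 64-Bit Server VM warning: INFO: "
        "os::commit_memory(0x00000007efc80000, 272105472, 0) failed; "
        "error='无法分配内存' (errno=12)")
info = parse_commit_failure(line)
print(info)
```

For the line above, the request is 272105472 bytes (259.5 MiB) and errno 12 is `ENOMEM` on Linux: the operating system could not back the allocation, which is different from the Java heap being exhausted.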

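The analysis: the Hadoop nodes run a YARN NodeManager and a Storm supervisor side by side, and the supervisor had a large number of workers. These consume a lot of memory while tasks are running, so the nodes ran out of memory.
The fix was to reduce the number of workers from 16 to 10 and restart the Storm service. A restart sometimes leaves the worker count at its old value, so the reliable way is to remove all the workers directly on the node and then start the supervisor again.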
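
The reasoning behind the smaller worker count can be written as a rough memory budget per node. The sketch below is only an illustration: the function names are made up, and every number in the example (32 GiB of RAM, 20 GiB for YARN containers, 768 MiB per Storm worker, 2 GiB reserved for the OS and daemons) is a hypothetical value, not a measurement from this cluster.

```python
def node_memory_demand_mb(yarn_container_mb, storm_workers, worker_heap_mb, os_reserve_mb):
    """Worst-case memory demand of one node, in MiB."""
    return yarn_container_mb + storm_workers * worker_heap_mb + os_reserve_mb


def is_overcommitted(physical_mb, **demand):
    """True if the worst-case demand exceeds the node's physical memory."""
    return node_memory_demand_mb(**demand) > physical_mb


# Hypothetical node: all values below are assumptions for illustration only.
node = dict(yarn_container_mb=20480, worker_heap_mb=768, os_reserve_mb=2048)

print(is_overcommitted(32768, storm_workers=16, **node))  # 34816 MiB demanded
print(is_overcommitted(32768, storm_workers=10, **node))  # 30208 MiB demanded
```

With these assumed figures, 16 workers push the worst case past physical memory while 10 workers leave some headroom. The actual values should be taken from the node's own YARN and Storm configuration.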

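After observing for a while, the Hive jobs no longer failed. My guess at the root cause: when the job that produces the raw data was writing Avro data, the memory shortage on the nodes left some Avro files incomplete on HDFS, and Hive then reported errors when it tried to read them.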

Article URL: http://www.bnee.net/article/51773.html
