issue with the "test.py" #1

Open
eisneim opened this issue Jun 7, 2017 · 10 comments

Comments


eisneim commented Jun 7, 2017

Hi @Deeperjia, first of all, this is a great project and it is really helpful. Thank you for your work.

There is one issue: in test.py, SpeechLoader is initialized without label_file:

speech_loader = SpeechLoader()

This causes utils.py to fail because there is no label file to open:

    self.preprocess(wav_path, label_file, wavs_file, vocab_file, mfcc_tensor, label_tensor)
  File ".../utils.py", line 54, in preprocess
    with codecs.open(label_file,"r", encoding=self.encoding) as f:
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/codecs.py", line 895, in open
    file = builtins.open(filename, mode, buffering)
TypeError: expected str, bytes or os.PathLike object, not NoneType

And if I pass the label_file path but omit wav_path when constructing SpeechLoader(), I get:

Traceback (most recent call last):
  File "test.py", line 60, in <module>
    speech_to_text()
  File "test.py", line 19, in speech_to_text
    speech_loader = SpeechLoader(label_file=label_file)
  File "...../utils.py", line 34, in __init__
    self.preprocess(wav_path, label_file, wavs_file, vocab_file, mfcc_tensor, label_tensor)
  File "..../utils.py", line 88, in preprocess
    self.wav_max_len = max(len(mfcc) for mfcc in self.mfcc_tensor)
ValueError: max() arg is an empty sequence
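
Both tracebacks above come from missing arguments reaching `preprocess`: the first from `label_file` being `None`, the second presumably from no wav files being found, which leaves `self.mfcc_tensor` empty. A minimal sketch of an early check that could be run before constructing `SpeechLoader`, assuming a hypothetical helper named `check_loader_inputs` (it is not part of the repository's `utils.py`):

```python
import os


def check_loader_inputs(wav_path, label_file):
    """Hypothetical pre-flight check; NOT part of the repository's utils.py.

    Raises ValueError with a readable message instead of failing deep
    inside preprocess(). Returns the sorted list of .wav files found.
    """
    if label_file is None or not os.path.isfile(label_file):
        raise ValueError("label_file must be an existing file, got %r" % (label_file,))
    if wav_path is None or not os.path.isdir(wav_path):
        raise ValueError("wav_path must be an existing directory, got %r" % (wav_path,))

    # Walk the directory tree and collect every .wav file.
    wavs = []
    for root, _dirs, files in os.walk(wav_path):
        for name in files:
            if name.lower().endswith(".wav"):
                wavs.append(os.path.join(root, name))

    # An empty list here is what would later make max() fail.
    if not wavs:
        raise ValueError("no .wav files found under %r" % (wav_path,))
    return sorted(wavs)
```

Calling this with both paths before `SpeechLoader(...)` would turn either failure into a single message that names the missing argument.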

eisneim commented Jun 7, 2017

After some experiments, I got unusable results even after 100 epochs:

latest ckpt: /Users/eisneim/www/deepLearning/deeperjia_tensorflow-wavenet_cn/model/speech.module-10
==> restore model costs: 8.079950094223022s
---------------------------
Input: /Volumes/raid/_deeplearning/THCHS30语音/wav/test/D4/D4_750.wav
Output: 苏美军的一下爱国想市马债山一动党军乎苏名外断全没等也奋体逃战
---------------------------
Input: /Volumes/raid/_deeplearning/THCHS30语音/wav/test/D4/D4_751.wav
Output: 王英看北香边后不分云面三产气来几似为组终为抓果
---------------------------
Input: /Volumes/raid/_deeplearning/THCHS30语音/wav/test/D4/D4_752.wav
Output: 他们种大斯喜路一家茶顺不里按就说药软义鱼语她增把了又被扎来的频繁
---------------------------
Input: /Volumes/raid/_deeplearning/THCHS30语音/wav/test/D4/D4_753.wav
Output: 几百来没纹书经少在订小学钉响也将于些江外把位军享就处昏线的人
---------------------------
Input: /Volumes/raid/_deeplearning/THCHS30语音/wav/test/D4/D4_754.wav
Output: 待得肉丝个可等柳根怎疼团初姆毛有王恶末奥会次原是丰飘饭伞感问秋区表事韦启好种行机细高合许期底长登算港民名根菱银据同知秋顺半百人窃用设的释皮音身高了哑不迪请急比屏七八西水市尼场胜倒崔造厌交腔国岸工你本出多放亭的着员引已锐给不业谊密服混名年议腰遵耀畜猛马偏墙筒顺级既从嘉贤从偏结难软然性

Just wondering what results you get on your side? It would be really useful if an evaluation script were provided.
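
As an illustration of what such an evaluation script might compute, here is a minimal sketch of character error rate (CER), a metric commonly used for Chinese speech recognition. It is not part of the repository; the function names `edit_distance` and `character_error_rate` are hypothetical, and the reference transcripts are assumed to be available as plain strings.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions, substitutions all cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Total character edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars
```

The `Output:` strings printed by test.py would be the hypotheses, paired with the matching lines from the transcript file as references. A CER near or above 1.0 would confirm that the output is essentially unrelated to the audio.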

Deeperjia (Owner) commented

Thanks for your interest. The results I got are nearly the same as yours, since I did not tune the parameters carefully.


dllen commented Sep 30, 2017

@eisneim I also ran into problems while testing. Could you post your test.py for reference? Thanks!


arixlin commented Nov 13, 2017

@dllen you can load the data like this:
speech_loader = SpeechLoader(wav_path='data/wav/train', label_file='data/doc/trans/train.word.txt', n_mfcc=60)
but restoring the checkpoint then fails with a missing key:
Not found: Key conv1d10/variance/Adam not found in checkpoint
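
A `Key .../Adam not found in checkpoint` message usually means the graph being restored contains Adam optimizer slot variables that the checkpoint does not. Whether that is the cause here is not confirmed in this thread. One possible workaround, sketched under the assumption that only the model weights are needed at test time, is to restore only the non-optimizer variables. The helper names below are hypothetical, and they operate on plain name strings so the sketch runs without TensorFlow; in TensorFlow 1.x the resulting names would be used to select the variables passed as `var_list` to `tf.train.Saver`.

```python
# Last path components that TensorFlow 1.x's AdamOptimizer gives to its
# slot and bookkeeping variables.
ADAM_SUFFIXES = ("Adam", "Adam_1", "beta1_power", "beta2_power")


def is_optimizer_var(name):
    """True if a variable name looks like an Adam slot or accumulator."""
    base = name.split(":")[0]          # drop an output suffix such as ":0"
    last = base.rsplit("/", 1)[-1]     # last path component
    return last in ADAM_SUFFIXES


def split_optimizer_vars(names):
    """Hypothetical helper: split names into (model_vars, optimizer_vars)."""
    model, optimizer = [], []
    for name in names:
        (optimizer if is_optimizer_var(name) else model).append(name)
    return model, optimizer
```

For example, given `conv1d10/variance` and `conv1d10/variance/Adam`, only the first would be kept for restoring.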


arixlin commented Nov 13, 2017

@dllen https://github.com/arixlin/tensorflow-wavenet I fixed some bugs there; you can use my version as a reference.


finebck commented Jan 15, 2018

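@arixlin Thanks. I used your code as a reference and ran it. I have not tuned the parameters yet, but the results do not seem very good. Did you get good results later on?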


arixlin commented Jan 15, 2018

@finebck This only handles the wave part. If you want to do speech recognition, you also need to combine it with an HMM or LSTM for the NLP side.


finebck commented Jan 15, 2018

@arixlin So the language model still needs to be improved? Do you have any suggestions on that?


czifan commented Feb 18, 2018

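Hello. The model I trained with this code always predicts empty output. I then rewrote the code in a different framework, and the model trained there behaves the same way. Is this a characteristic of the model, or did I get something wrong?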


NPCv7 commented Mar 4, 2024

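Quoting czifan: "Hello. The model I trained with this code always predicts empty output. I then rewrote the code in a different framework, and the model trained there behaves the same way. Is this a characteristic of the model, or did I get something wrong?"

Friend, how did you solve this in the end?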
