• Hi Detlev,

    In this case it does not understand the speech; it only learns to predict the frequency shift. The prediction range is -300 to +300 Hz. So far I have trained it with English voices, and one of the planned tests is to try it with other languages. Also, the window given to the model is currently 0.1 sec, which is too short to understand speech. Generally speaking, the model tries to predict whatever it was trained to predict, and in my case that is the frequency shift.
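    To make the setup concrete, here is a minimal sketch of cutting audio into the 0.1 sec windows the model sees. The sample rate is my assumption; the post only states the 0.1 sec window and the -300..+300 Hz output range.

```python
import numpy as np

SAMPLE_RATE = 8000               # assumed audio sample rate (not stated in the post)
FRAME = int(0.1 * SAMPLE_RATE)   # the 0.1 sec analysis window mentioned above

def split_frames(audio):
    """Cut a signal into non-overlapping 0.1 sec frames.

    Each frame is one input to the shift-prediction model, whose
    output lies in the -300..+300 Hz range.
    """
    n = len(audio) // FRAME
    return audio[: n * FRAME].reshape(n, FRAME)

frames = split_frames(np.zeros(SAMPLE_RATE))  # one second of audio
```

    With one second of audio this yields ten frames, each of which would be predicted independently.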

  • Hi,

    A new model has been implemented, trained with variable levels of noise added to the signals. It requires much more training time (9 hours instead of 20 minutes) but gives better results, and it is better suited to real audio and video. The resulting model now has a mean error of 25 Hz and a squared error of 1376. It must be mentioned that the data used for testing performance is not used for training.
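    A sketch of the kind of noise augmentation described above. The SNR range is my assumption; the post does not state the actual noise levels used in training.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(signal, snr_db):
    """Add white Gaussian noise at the given signal-to-noise ratio (dB)."""
    sig_power = np.mean(signal ** 2)
    noise_power = sig_power / (10 ** (snr_db / 10))
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)

def augment(signal):
    """Give each training example a randomly chosen noise level."""
    return add_noise(signal, rng.uniform(0.0, 30.0))  # assumed SNR range
```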

  • Hi,

    Does the artificial intelligence understand the speech?


    Hi Detlev,

    I come back to your question. In general, AI can understand speech. There are Python modules that you can feed with voice in various languages and that return the content as text. But these would probably fail to decode a voice that is shifted by even a few Hz; a new training run for recognising frequency-shifted voice would be required, and the results might be poorer than the existing ones. Of course, once you get text, translation can be implemented, and cross-language QSOs become a new reality in ham radio communications. One more idea I had during development of the current solution is a voice-activated squelch that recognises human voice and opens the audio. In general, it seems that with AI new applications can be implemented that were not possible until now.
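    As a rough illustration of the voice-activated squelch idea, here is a sketch using a simple energy gate. In the AI version described above, a model trained to distinguish human voice from band noise would replace the threshold test; the threshold value here is purely illustrative.

```python
import numpy as np

THRESHOLD = 0.01   # assumed opening level; a trained voice/noise
                   # classifier would replace this simple energy test

def squelch_open(frame):
    """Return True when the audio path should open.

    Sketch only: an AI squelch would instead feed the frame to a
    model trained to recognise human voice.
    """
    return float(np.mean(frame ** 2)) > THRESHOLD

silence = np.zeros(800)
speech_like = 0.5 * np.sin(2 * np.pi * 1000 * np.arange(800) / 8000)
```

    The squelch stays closed on silence and opens on a strong signal; the AI classifier would additionally keep it closed on non-voice signals.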

  • Hi,

    This is a dropbox link:

    A working demo package has been built to demonstrate SSB offset-frequency finding. It is not a practical application for everyday use, but a demo that can be run without special hardware. I personally use WSL Ubuntu 22.04.

    There are 3 files: two Python 3 programs and the AI model file. The model file is not permitted to be used in a commercial application. To use the programs, the user must make a websdr recording from the QO100 websdr web interface. The system can find frequency shifts from -300 Hz to +300 Hz. In order not to lose audio information, the bandwidth can be increased. Before use, be sure you have run pip3 install numpy librosa tensorflow (if pip3 does not exist, install it as a package). Then you can find the frequency shift by calling python3 websdr-recording-xxx.wav .

    If all is OK, the signed mean frequency will be displayed. Then we can convert the downloaded file by calling the gnuradio Python 3 converter (be sure you have gnuradio installed). The syntax is, for example:

    python3 -i websdr-recording-xxx.wav -o outfile.wav -f 177

    Please note that 177 is an example of the frequency shift that the program applies. It must be the same value as the mean value found before, but with the opposite sign. The outfile.wav can then be played to check whether it is on the proper frequency.
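    For illustration, the correction step can be sketched in plain numpy. This is my own sketch of the idea, not the actual gnuradio converter: build the analytic signal with an FFT-based Hilbert transform, mix it with a complex oscillator at the correction frequency, and keep the real part.

```python
import numpy as np

def shift_frequency(audio, shift_hz, rate):
    """Shift a real audio signal by shift_hz (sketch of the correction step).

    Builds the analytic signal via an FFT-based Hilbert transform,
    mixes with a complex oscillator, and returns the real part.
    """
    n = len(audio)
    spectrum = np.fft.fft(audio)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0                     # keep the Nyquist bin
    analytic = np.fft.ifft(spectrum * h)    # audio + i * Hilbert(audio)
    t = np.arange(n) / rate
    return np.real(analytic * np.exp(2j * np.pi * shift_hz * t))
```

    A 1000 Hz tone shifted by +177 Hz comes out at 1177 Hz, which is the kind of move the converter makes when undoing a measured -177 Hz offset.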

    George SV1BDS

  • Hi,

    I noticed that the 0.1 sec frequency evaluations follow a normal distribution (as normally expected). For every file that I test there are a few tens of 0.1 sec samples. The mean error of 25 Hz found until now was computed per 0.1 sec sample. Now I compute the mean value over each voice file and then measure the difference from the real frequency. In this way the results are more realistic, because they do not depend on the distribution of the error but only on the mean value over all the samples. The new mean error is found to be only 12.3 Hz.
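    The per-file averaging can be sketched as follows. The numbers in the example are illustrative only, not my actual measurements: for roughly independent per-frame errors, the error of the mean shrinks by about the square root of the number of frames, which is why the per-file figure is well below the 25 Hz per-frame figure.

```python
import numpy as np

def file_mean_error(per_frame_hz, true_shift_hz):
    """Average the 0.1 sec per-frame predictions over a whole file
    and return the absolute error of that mean vs. the true shift."""
    return abs(float(np.mean(per_frame_hz)) - true_shift_hz)

# illustrative: 40 noisy per-frame predictions around a true +100 Hz shift,
# with a 25 Hz per-frame standard deviation
rng = np.random.default_rng(1)
preds = 100.0 + rng.normal(0.0, 25.0, size=40)
```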