Setting up a FunASR speech recognition service
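<p>This article covers speech recognition with FunASR, Alibaba's open-source speech recognition project. It describes two deployment options: the Windows community build and a Docker container. The Windows community build is suitable for local development; the container is recommended for production.</p><h2>1. Windows community build deployment</h2>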
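<h3> 1.1 Environment setup</h3>
<p> The software needs the Visual Studio 2022 C++ runtime. If it is not installed, double-click VC_redist.x64(2022).exe to install the libraries required to run C++ programs compiled with Visual Studio 2022.</p>
<h3> 1.2 Download the Windows community package</h3>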
<p> https://www.modelscope.cn/models/iic/funasr-runtime-win-cpu-x64/files</p>
<p> <img alt="image" width="1253" height="478" loading="lazy" data-src="https://img2024.cnblogs.com/blog/1607557/202512/1607557-20251226091034169-345184089.png" ></p>
<p> Download any version; this article uses 0.2.0.</p>
<h3> 1.3 Download the required models</h3>
git clone https://www.modelscope.cn/damo/speech_fsmn_vad_zh-cn-16k-common-onnx.git;
git clone https://www.modelscope.cn/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx.git;
git clone https://www.modelscope.cn/damo/speech_ngram_lm_zh-cn-ai-wesp-fst.git;
git clone https://www.modelscope.cn/damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx.git;
git clone https://www.modelscope.cn/thuduj12/fst_itn_zh.git
<h3> 1.4 Start the service</h3>
<p> Unzip the Windows community package downloaded above, open PowerShell, change into the extracted directory, and run the following command:</p>
./funasr-wss-server.exe `
--vad-dir D:/developTest/funasr-runtime-resources/models/speech_fsmn_vad_zh-cn-16k-common-onnx `
--model-dir D:/developTest/funasr-runtime-resources/models/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx `
--lm-dir D:/developTest/funasr-runtime-resources/models/speech_ngram_lm_zh-cn-ai-wesp-fst `
--punc-dir D:/developTest/funasr-runtime-resources/models/punc_ct-transformer_cn-en-common-vocab471067-large-onnx `
--itn-dir D:/developTest/funasr-runtime-resources/models/fst_itn_zh `
--certfile 0
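<p> Parameters:</p>
--model-dir  ModelScope model ID or local model path (speech recognition model)
--vad-dir  ModelScope model ID or local model path (voice activity detection model)
--punc-dir  ModelScope model ID or local model path (punctuation model)
--lm-dir  ModelScope model ID or local model path (language model)
--itn-dir  ModelScope model ID or local model path (inverse text normalization model)
--certfile  SSL certificate file; set it to 0 to disable SSL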
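<p> As a rough illustration, the same argument list can be assembled in Java from a single models directory. This is only a sketch: the class name FunasrServerCommand and its build method are hypothetical, and it uses just the flags and model folder names listed above.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of FunASR): builds the argument list for
// funasr-wss-server.exe from one models directory.
class FunasrServerCommand {

    static List<String> build(String exePath, String modelsRoot) {
        List<String> cmd = new ArrayList<>();
        cmd.add(exePath);
        addDir(cmd, "--vad-dir", modelsRoot, "speech_fsmn_vad_zh-cn-16k-common-onnx");
        addDir(cmd, "--model-dir", modelsRoot,
                "speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx");
        addDir(cmd, "--lm-dir", modelsRoot, "speech_ngram_lm_zh-cn-ai-wesp-fst");
        addDir(cmd, "--punc-dir", modelsRoot,
                "punc_ct-transformer_cn-en-common-vocab471067-large-onnx");
        addDir(cmd, "--itn-dir", modelsRoot, "fst_itn_zh");
        cmd.add("--certfile");
        cmd.add("0"); // 0 disables SSL, as described above
        return cmd;
    }

    // Appends a flag followed by the model folder under the models root.
    private static void addDir(List<String> cmd, String flag, String root, String folder) {
        cmd.add(flag);
        cmd.add(root + "/" + folder);
    }

    public static void main(String[] args) {
        List<String> cmd = build("./funasr-wss-server.exe",
                "D:/developTest/funasr-runtime-resources/models");
        // Print the command; it could be started with new ProcessBuilder(cmd).inheritIO().start()
        System.out.println(String.join(" ", cmd));
    }
}
```

<p> Keeping the models root in one place means only one path changes when the models are moved.</p>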
<h3> 1.5 Calling from the client</h3>
<p> The extracted directory of the Windows community build contains the client executable funasr-wss-client.exe:</p>
./funasr-wss-client.exe --server-ip 127.0.0.1 --port 10095 --wav-path asr_example_zh.wav
<p> The service listens on port 10095 by default; --wav-path specifies the path of the audio file.</p>
<h2>2. Docker container deployment</h2>
<h3> 2.1 Pull the Docker image</h3>
docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6
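<h3> 2.2 Start the container</h3>
<p> Create a models directory on the host and put the models in it. Downloading the models manually with the git commands from section 1.3 is recommended; letting the startup command download them automatically is very slow.</p>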
docker run -p 10095:10095 -it --privileged=true -v D:\developTest\funasr-runtime-resources\models:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6
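<p> This maps the container port and mounts the models directory created earlier into the container.</p>
<h3> 2.3 Start the service</h3>
<p> Enter the container and change to the FunASR/runtime directory.</p>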
<p> <img alt="image" loading="lazy" data-src="https://img2024.cnblogs.com/blog/1607557/202512/1607557-20251226092232599-408304639.png" ></p>
<p> Run the following command to start the service:</p>
nohup bash run_server.sh \
--certfile 0 \
--vad-dir speech_fsmn_vad_zh-cn-16k-common-onnx \
--model-dir speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--punc-dir punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
--lm-dir speech_ngram_lm_zh-cn-ai-wesp-fst \
--itn-dir fst_itn_zh > log.txt 2>&1 &
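<p> The command points at the model directories. The models were downloaded beforehand, so the startup command does not need to fetch them. Setting certfile to 0 disables SSL.</p>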
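<p> Because the server is started in the background with nohup, a plain TCP connection test is one quick way to confirm that it is accepting connections before any audio is sent. The sketch below assumes the default port 10095 mentioned in section 1.5; FunasrPortCheck and isListening are hypothetical names, and a successful connect only shows that something is listening, not that the models loaded correctly (log.txt is the place to check that).</p>

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
class FunasrPortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable
            return false;
        }
    }

    public static void main(String[] args) {
        // 10095 is the default service port used throughout this article
        boolean up = isListening("127.0.0.1", 10095, 2000);
        System.out.println("FunASR port reachable: " + up);
    }
}
```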
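<h2>3. Usage examples</h2>
<p> Below are two rough ways to call the service from Java: one runs the client executable through ProcessBuilder, the other connects directly with WebSocketClient. Treat them as a starting point.</p>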
<ul >
<li id="u0a0bed12" data-lake-index-type="0">Use ProcessBuilder to run the client command shown above and capture its output</li>
</ul>
public String localTranslation(MultipartRequest multipartRequest) {
StringBuffer resultBuffer = new StringBuffer();
// Path of the client executable and the arguments passed to it
String exePath = "D:\\developTest\\funasr-runtime-resources\\funasr-runtime-win-cpu-x64\\funasr-runtime-win-cpu-x64-v0.2.0\\funasr-wss-client.exe";
String serveIp = "127.0.0.1"; // IP address of the FunASR server
String port = "10095";
File targetFile = null;
try {
MultipartFile mFile = multipartRequest.getFile("file");
File dir = new File("D:\\developTest\\funasr-runtime-resources\\wav");
if (!dir.exists()) {
dir.mkdirs();
}
targetFile = File.createTempFile("tmp_", ".wav", dir);
mFile.transferTo(targetFile);
String wavPath = targetFile.getAbsolutePath();
String[] cmd = new String[]{exePath, "--server-ip", serveIp, "--port", port, "--wav-path", wavPath};
ProcessBuilder pb = new ProcessBuilder();
pb.command(cmd);
Process process = pb.start();
// Timeout
int timeoutSeconds = 30; // give up after 30 seconds
// Single-thread executor that reads the process output
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<?> future = executor.submit(() -> {
try {
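// Note: redirectErrorStream(true) only takes effect when called before pb.start(),
// so stdout and stderr are read separately here. If the client writes a lot to stderr,
// consider merging the streams before starting the process to avoid blocking.
// Read the external program's standard output
BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
String line;
while ((line = reader.readLine()) != null) {
System.out.println(line);
resultBuffer.append(line);
}
// Read the error output; the recognition result follows "on_message ="
BufferedReader errorReader = new BufferedReader(new InputStreamReader(process.getErrorStream()));
while ((line = errorReader.readLine()) != null) {
System.out.println(line);
if(line.contains("on_message")){
String[] array = line.split("on_message =");
if (array.length > 1) {
resultBuffer.append(array[1]);
}
}
}
// Wait for the program to finish
process.waitFor();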
} catch (Exception e) {
e.printStackTrace();
}finally {
if(process.isAlive()){
process.destroy();
}
}
});
try {
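// Wait for the reader task to finish, or time out
future.get(timeoutSeconds, TimeUnit.SECONDS);
System.out.println("Process finished within the time limit.");
} catch (Exception e) {
System.out.println("Timeout warning: the process may be hanging.");
resultBuffer.append("timeout");
} finally {
// Make sure the process is terminated
if(process.isAlive()){
process.destroy();
}
executor.shutdownNow(); // cancel the task and shut down the executor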
}
} catch (Exception e) {
e.printStackTrace();
resultBuffer.append("error");
}finally {
if (targetFile != null && targetFile.exists()) {
targetFile.delete();
}
}
System.out.println(resultBuffer.toString());
return resultBuffer.toString();
}
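<p> The step that is easiest to get wrong in the method above is pulling the recognition result out of the client's log lines. Here is a minimal sketch of that step as a standalone helper, assuming (as the method does) that the result follows the marker "on_message =" on a single line. OnMessageParser is a hypothetical name, and the sample line in main is made up for illustration; the real client's log prefix and payload may look different.</p>

```java
// Hypothetical helper: extracts the text that follows "on_message =" in a log line.
class OnMessageParser {

    private static final String MARKER = "on_message =";

    // Returns the trimmed text after the marker, or null if the line carries no payload.
    static String extractPayload(String line) {
        if (line == null) {
            return null;
        }
        int idx = line.indexOf(MARKER);
        if (idx < 0) {
            return null; // marker not present
        }
        String payload = line.substring(idx + MARKER.length()).trim();
        return payload.isEmpty() ? null : payload;
    }

    public static void main(String[] args) {
        // Made-up log line, for illustration only
        String line = "client log on_message = {\"text\":\"hello\"}";
        System.out.println(extractPayload(line));
    }
}
```

<p> Using indexOf and substring instead of split avoids an out-of-bounds access when a line ends right after the marker.</p>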
<ul >
<li id="u199a6f15" data-lake-index-type="0">Use WebSocketClient to call the FunASR service directly</li>
</ul>
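<p> Judging from the client class in this section, the exchange with the server runs as follows: a JSON configuration message goes first, then the audio as binary frames, then a JSON message with is_speaking set to false; the server answers with JSON that contains a text field. The sketch below builds the two JSON messages by hand so the field names are easy to see. FunasrHandshake is a hypothetical helper, it assumes the values need no JSON escaping, and it leaves out the optional hotwords field.</p>

```java
// Hypothetical helper: builds the JSON control messages the client class sends.
class FunasrHandshake {

    // First message: recognition settings. Values are assumed to need no JSON escaping.
    static String configMessage(String mode, int[] chunkSize, int chunkInterval,
                                String wavName, String wavFormat) {
        StringBuilder chunks = new StringBuilder();
        for (int i = 0; i < chunkSize.length; i++) {
            if (i > 0) {
                chunks.append(',');
            }
            chunks.append(chunkSize[i]);
        }
        return "{\"mode\":\"" + mode + "\","
                + "\"chunk_size\":[" + chunks + "],"
                + "\"chunk_interval\":" + chunkInterval + ","
                + "\"wav_name\":\"" + wavName + "\","
                + "\"wav_format\":\"" + wavFormat + "\","
                + "\"is_speaking\":true}";
    }

    // Last message: tells the server that no more audio will follow.
    static String endMessage() {
        return "{\"is_speaking\":false}";
    }

    public static void main(String[] args) {
        // Defaults taken from the client class: offline mode, chunk size 5,10,5, interval 10
        System.out.println(configMessage("offline", new int[] {5, 10, 5}, 10, "javatest", "pcm"));
        System.out.println(endMessage());
    }
}
```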
<p> Client utility class</p>
package com.example.demo1.web;
import java.io.*;
import java.net.URI;
import java.util.Map;
import org.java_websocket.client.WebSocketClient;
import org.java_websocket.drafts.Draft;
import org.java_websocket.handshake.ServerHandshake;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FunasrWsClient extends WebSocketClient {
private static final Logger logger = LoggerFactory.getLogger(FunasrWsClient.class);
private boolean iseof = false;
private static String wavPath;
private static String mode = "offline";
private static String strChunkSize = "5,10,5";
private static int chunkInterval = 10;
private static int sendChunkSize = 1920;
private static String hotwords="";
private static String fsthotwords="";
private String wavName = "javatest";
private MyCallBack callBack;
public FunasrWsClient(URI serverUri,MyCallBack callBack) {
super(serverUri);
this.callBack = callBack;
}
public FunasrWsClient(URI serverUri,String wavPath,MyCallBack callBack) {
super(serverUri);
this.callBack = callBack;
this.wavPath = wavPath;
}
public FunasrWsClient(URI serverUri,String strChunkSize,int chunkInterval,String mode,String hotwords,String wavPath,MyCallBack callBack) {
super(serverUri);
this.callBack = callBack;
this.strChunkSize = strChunkSize;
this.chunkInterval = chunkInterval;
this.mode = mode;
this.fsthotwords = hotwords;
this.wavPath = wavPath;
int RATE = 16000;
String[] chunkList = strChunkSize.split(",");
int int_chunk_size = 60 * Integer.valueOf(chunkList[1].trim()) / chunkInterval;
int CHUNK = Integer.valueOf(RATE / 1000 * int_chunk_size);
this.sendChunkSize = CHUNK * 2;
}
public class RecWavThread extends Thread {
private FunasrWsClient funasrClient;
public RecWavThread(FunasrWsClient funasrClient) {
this.funasrClient = funasrClient;
}
public void run() {
this.funasrClient.recWav();
}
}
public FunasrWsClient(URI serverUri, Draft draft) {
super(serverUri, draft);
}
public FunasrWsClient(URI serverURI) {
super(serverURI);
}
public FunasrWsClient(URI serverUri, Map<String, String> httpHeaders) {
super(serverUri, httpHeaders);
}
public void getSslContext(String keyfile, String certfile) {
// TODO
return;
}
public void sendJson(
String mode, String strChunkSize, int chunkInterval, String wavName, boolean isSpeaking,String suffix) {
try {
JSONObject obj = new JSONObject();
obj.put("mode", mode);
JSONArray array = new JSONArray();
String[] chunkList = strChunkSize.split(",");
for (int i = 0; i < chunkList.length; i++) {
array.add(Integer.valueOf(chunkList[i].trim()));
}
obj.put("chunk_size", array);
obj.put("chunk_interval", Integer.valueOf(chunkInterval));
obj.put("wav_name", wavName);
if(FunasrWsClient.hotwords.trim().length()>0)
{
String regex = "\\d+";
JSONObject jsonitems = new JSONObject();
String[] items=FunasrWsClient.hotwords.trim().split(" ");
Pattern pattern = Pattern.compile(regex);
String tmpWords="";
for(int i=0;i<items.length;i++)
{
// A purely numeric token is the weight of the hotword collected so far
Matcher matcher = pattern.matcher(items[i]);
if (matcher.matches()) {
jsonitems.put(tmpWords.trim(), items[i].trim());
tmpWords="";
continue;
}
tmpWords=tmpWords+items[i]+" ";
}
obj.put("hotwords", jsonitems.toString());
}
if(suffix.equals("wav")){
suffix="pcm";
}
obj.put("wav_format", suffix);
obj.put("is_speaking", Boolean.valueOf(isSpeaking));
logger.info("sendJson: " + obj);
// return;
send(obj.toString());
return;
} catch (Exception e) {
e.printStackTrace();
}
}
public void sendEof() {
try {
JSONObject obj = new JSONObject();
obj.put("is_speaking", Boolean.FALSE);
logger.info("sendEof: " + obj);
// return;
send(obj.toString());
iseof = true;
return;
} catch (Exception e) {
e.printStackTrace();
}
}
public void recWav() {
String fileName=FunasrWsClient.wavPath;
// File extension without the dot, e.g. "wav"
String suffix=fileName.substring(fileName.lastIndexOf('.') + 1);
sendJson(mode, strChunkSize, chunkInterval, wavName, true,suffix);
File file = new File(FunasrWsClient.wavPath);
int chunkSize = sendChunkSize;
byte[] bytes = new byte[chunkSize];
int readSize = 0;
try (FileInputStream fis = new FileInputStream(file)) {
if (FunasrWsClient.wavPath.endsWith(".wav")) {
// Skip the 44-byte header of a standard PCM WAV file so only audio data is sent
fis.read(bytes, 0, 44);
}
readSize = fis.read(bytes, 0, chunkSize);
while (readSize > 0) {
if (readSize == chunkSize) {
send(bytes);
} else {
// Last, shorter chunk: copy only the bytes that were actually read
byte[] tmpBytes = new byte[readSize];
for (int i = 0; i < readSize; i++) {
tmpBytes[i] = bytes[i];
}
send(tmpBytes);
}
if (!mode.equals("offline")) {
Thread.sleep(Integer.valueOf(chunkSize / 32));
}
readSize = fis.read(bytes, 0, chunkSize);
}
if (!mode.equals("offline")) {
Thread.sleep(2000);
sendEof();
Thread.sleep(3000);
close();
} else {
sendEof();
}
} catch (Exception e) {
e.printStackTrace();
}
}
@Override
public void onOpen(ServerHandshake handshakedata) {
RecWavThread thread = new RecWavThread(this);
thread.start();
}
@Override
public void onMessage(String message) {
JSONObject jsonObject = new JSONObject();
JSONParser jsonParser = new JSONParser();
logger.info("received: " + message);
try {
jsonObject = (JSONObject) jsonParser.parse(message);
logger.info("text: " + jsonObject.get("text"));
callBack.callBack(jsonObject.get("text"));
if(jsonObject.containsKey("timestamp"))
{
logger.info("timestamp: " + jsonObject.get("timestamp"));
}
} catch (org.json.simple.parser.ParseException e) {
e.printStackTrace();
}
if (iseof && mode.equals("offline") && !jsonObject.containsKey("is_final")) {
close();
}
if (iseof && mode.equals("offline") && jsonObject.containsKey("is_final") && jsonObject.get("is_final").equals("false")) {
close();
}
}
@Override
public void onClose(int code, String reason, boolean remote) {
logger.info(
"Connection closed by "
+ (remote ? "remote peer" : "us")
+ " Code: "
+ code
+ " Reason: "
+ reason);
}
@Override
public void onError(Exception ex) {
logger.info("ex: " + ex);
ex.printStackTrace();
}
}
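<p> The numbers in the client are easier to follow with the arithmetic written out. The sketch below repeats the constructor's calculation of the send chunk size and shows why recWav sleeps for chunkSize / 32 milliseconds in non-offline modes. FunasrChunkMath is a hypothetical name, and the factor of 2 is read here as 2 bytes per sample (16-bit mono PCM), which is an assumption about the audio format rather than something the article states.</p>

```java
// Hypothetical helper: reproduces the chunk-size arithmetic from the client class.
class FunasrChunkMath {

    static final int SAMPLE_RATE = 16000;  // Hz, same as RATE in the constructor
    static final int BYTES_PER_SAMPLE = 2; // assumption: 16-bit mono PCM

    // Bytes sent per WebSocket frame for a chunk setting such as "5,10,5".
    static int sendChunkSize(String strChunkSize, int chunkInterval) {
        String[] parts = strChunkSize.split(",");
        // The middle value scaled by 60 and divided by the interval gives milliseconds per chunk
        int chunkMillis = 60 * Integer.parseInt(parts[1].trim()) / chunkInterval;
        int samples = SAMPLE_RATE / 1000 * chunkMillis;
        return samples * BYTES_PER_SAMPLE;
    }

    // At 16 kHz and 2 bytes per sample, 32 bytes correspond to 1 ms of audio.
    static int pacingMillis(int chunkBytes) {
        return chunkBytes / 32;
    }

    public static void main(String[] args) {
        int bytes = sendChunkSize("5,10,5", 10);
        System.out.println(bytes);               // 1920, the client's default sendChunkSize
        System.out.println(pacingMillis(bytes)); // 60
    }
}
```

<p> With the defaults, each frame carries 60 ms of audio, so sleeping 60 ms between frames sends the file at roughly real-time speed.</p>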
<p> Calling the client class</p>
public static void main(String[] args) throws URISyntaxException {
String srvIp = "localhost";
String srvPort = "10095";
String wavPath = "D:\\developTest\\funasr-runtime-resources\\wav\\tmp_84677349854990998.wav";
Object lock = new Object();
StringBuffer text = new StringBuffer();
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<?> future = executor.submit(()->{
try {
String wsAddress = "ws://" + srvIp + ":" + srvPort;
FunasrWsClient c = new FunasrWsClient(new URI(wsAddress),wavPath,new MyCallBack(){
@Override
public void callBack(Object obj){
text.append(obj.toString());
synchronized (lock){
try {
lock.notify();
}catch (Exception e){
e.printStackTrace();
}
}
}
});
synchronized (lock){
c.connect();
lock.wait();
}
}catch (Exception e){
// e.printStackTrace();
}
});
try {
future.get(10, TimeUnit.SECONDS);
System.out.println("Finished within the time limit");
}catch (Exception e){
// e.printStackTrace();
System.out.println("Task timed out");
text.append("Task timed out");
}finally {
executor.shutdownNow(); // cancel the task and shut down the executor
}
System.out.println(text.toString());
}
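<p> Both the client class and the main method above rely on a MyCallBack type that is not shown. Here is a minimal sketch of what it could look like, inferred only from how it is used (a single callBack(Object) method); the author's actual definition may differ. MyCallBackDemo is a hypothetical class added just to exercise the interface.</p>

```java
// Hypothetical definition: the article uses MyCallBack but does not show it.
interface MyCallBack {
    // Called by the client with the recognized text of each server message
    void callBack(Object obj);
}

// Small demo showing how a caller can collect results through the callback.
class MyCallBackDemo {
    public static void main(String[] args) {
        StringBuilder text = new StringBuilder();
        // With a single abstract method, a lambda can replace the anonymous class
        MyCallBack cb = obj -> text.append(obj);
        cb.callBack("hello");
        System.out.println(text);
    }
}
```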