调用PDF转Word接口 您所在的位置:网站首页 阿里云教程pdf 调用PDF转Word接口

调用PDF转Word接口

2023-08-19 13:08| 来源: 网络整理| 查看: 265

PDF转Word接口可以将单个PDF文档转换为可编辑的Word文档,精准识别文本内容,并保留原始文档的版面样式信息。

PDF转Word接口为异步接口,需要先调用PDF转Word异步提交服务SubmitConvertPdfToWordJob进行异步任务提交,然后调用文档转换结果查询服务GetDocumentConvertResult接口进行结果轮询,建议每10秒轮询一次,最多轮询10分钟,如果10分钟还未查询到处理完成结果,则视为处理超时。

当异步任务处理提交后,用户可以在处理结束后的24小时之内查询处理结果,超过24小时后将无法查询到处理结果。

步骤一:调用PDF转Word异步提交服务SubmitConvertPdfToWordJob接口请求参数

名称

类型

必填

描述

示例值

FileUrl

string

单个文档的URL(不支持1000页以上的PDF文件,不支持100MB以上的PDF文件)

如果需要本地上传文件方式,SDK会提供单独入参支持文件流上传。

https://example.com/example.pdf

FileName

string

文件名,需带文件类型后缀。该接口只支持PDF文件。

example.pdf

说明

支持的文档格式:单个pdf文件。

返回参数

名称

类型

描述

示例值

RequestId

string

请求唯一ID

43A29C77-405E-4CC0-BC55-EE694AD0****

Data

object

返回数据

{"Id": "docmind-20220712-b15f****"}

+Id

string

业务订单号,用于后续查询接口进行查询的唯一标识

docmind-20220712-b15f****

Code

string

状态码

200

Message

string

详细信息

message

示例

本接口支持本地文档上传和传入文档URL这两种调用方式。

以Java SDK为例,本地文档上传调用方式的请求示例代码如下,调用submitConvertPdfToWordJobAdvance接口,通过fileUrlObject参数实现本地文档上传。

说明

获取并使用AccessKey信息的方式,可参考SDK概述中不同语言的SDK使用指南。

import com.aliyun.docmind_api20220711.models.*; import com.aliyun.teaopenapi.models.Config; import com.aliyun.docmind_api20220711.Client; import com.aliyun.teautil.models.RuntimeOptions; import java.io.File; import java.io.FileInputStream; public static void submit() throws Exception { // 使用默认凭证初始化Credentials Client。 com.aliyun.credentials.Client credentialClient = new com.aliyun.credentials.Client(); Config config = new Config() // 通过credentials获取配置中的AccessKey ID .setAccessKeyId(credentialClient.getAccessKeyId()) // 通过credentials获取配置中的AccessKey Secret .setAccessKeySecret(credentialClient.getAccessKeySecret()); // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com config.endpoint = "docmind-api.cn-hangzhou.aliyuncs.com"; Client client = new Client(config); // 创建RuntimeObject实例并设置运行参数 RuntimeOptions runtime = new RuntimeOptions(); SubmitConvertPdfToWordJobAdvanceRequest advanceRequest = new SubmitConvertPdfToWordJobAdvanceRequest(); File file = new File("D:\\example.pdf"); advanceRequest.fileUrlObject = new FileInputStream(file); advanceRequest.fileName = "example.pdf"; // 发起请求并处理应答或异常。 SubmitConvertPdfToWordJobResponse response = client.submitConvertPdfToWordJobAdvance(advanceRequest, runtime); }const Client = require('@alicloud/docmind-api20220711'); const Util = require('@alicloud/tea-util'); const fs = require('fs'); const getResult = async () => { // 使用默认凭证初始化Credentials Client const cred = new Credential.default(); const client = new Client.default({ // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com endpoint: 'docmind-api.cn-hangzhou.aliyuncs.com', // 通过credentials获取配置中的AccessKey ID accessKeyId: cred.credential.accessKeyId, // 通过credentials获取配置中的AccessKey Secret accessKeySecret: cred.credential.accessKeySecret, type: 'access_key', regionId: 'cn-hangzhou' }); const advanceRequest = new Client.SubmitConvertPdfToWordJobAdvanceRequest(); const file = fs.createReadStream('./example.pdf'); advanceRequest.fileUrlObject = file; advanceRequest.fileName = 'example.pdf'; const runtimeObject = new Util.RuntimeOptions({}); const response = await client.submitConvertPdfToWordJobAdvance(advanceRequest, runtimeObject); return response.body; }from alibabacloud_docmind_api20220711.client import Client as docmind_api20220711Client from alibabacloud_tea_openapi import models as open_api_models from alibabacloud_docmind_api20220711 import models as docmind_api20220711_models from alibabacloud_tea_util.client import Client as UtilClient from alibabacloud_tea_util import models as util_models from alibabacloud_credentials.client import Client as CredClient def submit_file(): cred=CredClient() config = open_api_models.Config( # 通过credentials获取配置中的AccessKey ID access_key_id=cred.get_access_key_id(), # 通过credentials获取配置中的AccessKey Secret access_key_secret=cred.get_access_key_secret() ) # 访问的域名 config.endpoint = f'docmind-api.cn-hangzhou.aliyuncs.com' client = docmind_api20220711Client(config) request = docmind_api20220711_models.SubmitConvertPdfToWordJobAdvanceRequest( # file_url_object : 本地文件流 file_url_object=open("./example.pdf", "rb"), # file_name :文件名称。名称必须包含文件类型 file_name='123.pdf', ) runtime = util_models.RuntimeOptions() try: # 复制代码运行请自行打印 API 的返回值 response = client.submit_convert_pdf_to_word_job_advance(request, runtime) # API返回值格式层级为 body -> data -> 具体属性。可根据业务需要打印相应的结果。如下示例为打印返回的业务id格式 # 获取属性值均以小写开头, print(response.body.data.id) except Exception as error: # 如有需要,请打印 error UtilClient.assert_as_string(error.message)using Newtonsoft.Json; using System; using System.Collections; using System.Collections.Generic; using System.IO; using System.Threading.Tasks; using Tea; using Tea.Utils; public static void SubmitFile() { // 使用默认凭证初始化Credentials Client。 var akCredential = new Aliyun.Credentials.Client(null); AlibabaCloud.OpenApiClient.Models.Config config = new AlibabaCloud.OpenApiClient.Models.Config { // 通过credentials获取配置中的AccessKey Secret AccessKeyId = akCredential.GetAccessKeyId(), // 通过credentials获取配置中的AccessKey Secret AccessKeySecret = akCredential.GetAccessKeySecret(), }; // 访问的域名 config.Endpoint = "docmind-api.cn-hangzhou.aliyuncs.com"; //需要安装额外的依赖库--> AlibabaCloud.DarabonbaStream AlibabaCloud.SDK.Docmind_api20220711.Client client = new AlibabaCloud.SDK.Docmind_api20220711.Client(config); Stream bodySyream = AlibabaCloud.DarabonbaStream.StreamUtil.ReadFromFilePath(""); AlibabaCloud.SDK.Docmind_api20220711.Models.SubmitConvertPdfToWordJobAdvanceRequest request = new AlibabaCloud.SDK.Docmind_api20220711.Models.SubmitConvertPdfToWordJobAdvanceRequest { FileUrlObject = bodySyream, FileName = "example.pdf" }; AlibabaCloud.TeaUtil.Models.RuntimeOptions runtime = new AlibabaCloud.TeaUtil.Models.RuntimeOptions(); try { // 复制代码运行请自行打印 API 的返回值 client.SubmitConvertPdfToWordJobAdvance(request, runtime); } catch (TeaException error) { // 如有需要,请打印 error AlibabaCloud.TeaUtil.Common.AssertAsString(error.Message); } catch (Exception _error) { TeaException error = new TeaException(new Dictionary { { "message", _error.Message } }); // 如有需要,请打印 error AlibabaCloud.TeaUtil.Common.AssertAsString(error.Message); } }import ( "fmt" "os" openClient "github.com/alibabacloud-go/darabonba-openapi/v2/client" "github.com/alibabacloud-go/docmind-api-20220711/client" "github.com/alibabacloud-go/tea-utils/v2/service" "github.com/aliyun/credentials-go/credentials" ) func submit(){ // 使用默认凭证初始化Credentials Client。 credential, err := credentials.NewCredential(nil) // 通过credentials获取配置中的AccessKey ID accessKeyId, err := credential.GetAccessKeyId() // 通过credentials获取配置中的AccessKey Secret accessKeySecret, err := credential.GetAccessKeySecret() // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com var endpoint string = "docmind-api.cn-hangzhou.aliyuncs.com" config := openClient.Config{AccessKeyId: accessKeyId, AccessKeySecret: accessKeySecret, Endpoint: &endpoint} // 初始化client cli, err := client.NewClient(&config) if err != nil { panic(err) } // 上传本地文档调用接口 filename := "D:\\example.pdf" f, err := os.Open(filename) if err != nil { panic(err) } // 初始化接口request request := client.SubmitConvertPdfToWordJobAdvanceRequest{ FileName: &filename, FileUrlObject: f, } // 创建RuntimeObject实例并设置运行参数 options := service.RuntimeOptions{} response, err := cli.SubmitConvertPdfToWordJobAdvance(&request, &options) if err != nil { panic(err) } // 打印结果 fmt.Println(response.Body.String()) }

以Java SDK为例,传入文档URL调用方式的请求示例代码如下,调用submitConvertPdfToWordJob接口,通过fileUrl参数实现传入文档URL。请注意,您传入的文档URL必须为公网可访问下载的公网URL地址,无跨域限制,URL不带特殊转义字符。

说明

获取并使用AccessKey信息的方式,可参考SDK概述中不同语言的SDK使用指南。

import com.aliyun.docmind_api20220711.models.*; import com.aliyun.teaopenapi.models.Config; import com.aliyun.docmind_api20220711.Client; public static void submit() throws Exception { // 使用默认凭证初始化Credentials Client。 com.aliyun.credentials.Client credentialClient = new com.aliyun.credentials.Client(); Config config = new Config() // 通过credentials获取配置中的AccessKey ID .setAccessKeyId(credentialClient.getAccessKeyId()) // 通过credentials获取配置中的AccessKey Secret .setAccessKeySecret(credentialClient.getAccessKeySecret()); // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com config.endpoint = "docmind-api.cn-hangzhou.aliyuncs.com"; Client client = new Client(config); SubmitConvertPdfToWordJobRequest request = new SubmitConvertPdfToWordJobRequest(); request.fileName = "example.pdf"; request.fileUrl = "https://example.com/example.pdf"; SubmitConvertPdfToWordJobResponse response = client.submitConvertPdfToWordJob(request); }const Client = require('@alicloud/docmind-api20220711'); const getResult = async () => { // 使用默认凭证初始化Credentials Client const cred = new Credential.default(); const client = new Client.default({ // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com endpoint: 'docmind-api.cn-hangzhou.aliyuncs.com', // 通过credentials获取配置中的AccessKey ID accessKeyId: cred.credential.accessKeyId, // 通过credentials获取配置中的AccessKey Secret accessKeySecret: cred.credential.accessKeySecret, type: 'access_key', regionId: 'cn-hangzhou' }); const request = new Client.SubmitConvertPdfToWordJobRequest(); request.fileName = 'example.pdf'; request.fileUrl = 'https://example.com/example.pdf'; const response = await client.submitConvertPdfToWordJob(request); return response.body; }from alibabacloud_docmind_api20220711.client import Client as docmind_api20220711Client from alibabacloud_tea_openapi import models as open_api_models from alibabacloud_docmind_api20220711 import models as docmind_api20220711_models from alibabacloud_tea_util.client import Client as UtilClient from alibabacloud_credentials.client import Client as CredClient def submit_url(): cred=CredClient() config = open_api_models.Config( # 通过credentials获取配置中的AccessKey ID access_key_id=cred.get_access_key_id(), # 通过credentials获取配置中的AccessKey Secret access_key_secret=cred.get_access_key_secret() ) # 访问的域名 config.endpoint = f'docmind-api.cn-hangzhou.aliyuncs.com' client = docmind_api20220711Client(config) request = docmind_api20220711_models.SubmitConvertPdfToWordJobRequest( # file_url : 文件url地址 file_url='https://example.com/example.pdf', # file_name :文件名称。名称必须包含文件类型 file_name='123.pdf', ) try: # 复制代码运行请自行打印 API 的返回值 response = client.submit_convert_pdf_to_word_job(request) # API返回值格式层级为 body -> data -> 具体属性。可根据业务需要打印相应的结果。如下示例为打印返回的业务id格式 # 获取属性值均以小写开头, print(response.body.data.id) except Exception as error: # 如有需要,请打印 error UtilClient.assert_as_string(error.message)using Newtonsoft.Json; using System; using System.Collections; using System.Collections.Generic; using System.IO; using System.Threading.Tasks; using Tea; using Tea.Utils; public static void SubmitUrl() { // 使用默认凭证初始化Credentials Client。 var akCredential = new Aliyun.Credentials.Client(null); AlibabaCloud.OpenApiClient.Models.Config config = new AlibabaCloud.OpenApiClient.Models.Config { // 通过credentials获取配置中的AccessKey Secret AccessKeyId = akCredential.GetAccessKeyId(), // 通过credentials获取配置中的AccessKey Secret AccessKeySecret = akCredential.GetAccessKeySecret(), }; // 访问的域名 config.Endpoint = "docmind-api.cn-hangzhou.aliyuncs.com"; AlibabaCloud.SDK.Docmind_api20220711.Client client = new AlibabaCloud.SDK.Docmind_api20220711.Client(config); AlibabaCloud.SDK.Docmind_api20220711.Models.SubmitConvertPdfToWordJobRequest request = new AlibabaCloud.SDK.Docmind_api20220711.Models.SubmitConvertPdfToWordJobRequest { FileUrl = "https://example.pdf", FileName = "example.pdf" }; try { // 复制代码运行请自行打印 API 的返回值 client.SubmitConvertPdfToWordJob(request); } catch (TeaException error) { // 如有需要,请打印 error AlibabaCloud.TeaUtil.Common.AssertAsString(error.Message); } catch (Exception _error) { TeaException error = new TeaException(new Dictionary { { "message", _error.Message } }); // 如有需要,请打印 error AlibabaCloud.TeaUtil.Common.AssertAsString(error.Message); } }import ( "fmt" openClient "github.com/alibabacloud-go/darabonba-openapi/v2/client" "github.com/alibabacloud-go/docmind-api-20220711/client" "github.com/aliyun/credentials-go/credentials" ) func submit(){ // 使用默认凭证初始化Credentials Client。 credential, err := credentials.NewCredential(nil) // 通过credentials获取配置中的AccessKey ID accessKeyId, err := credential.GetAccessKeyId() // 通过credentials获取配置中的AccessKey Secret accessKeySecret, err := credential.GetAccessKeySecret() // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com var endpoint string = "docmind-api.cn-hangzhou.aliyuncs.com" config := openClient.Config{AccessKeyId: accessKeyId, AccessKeySecret: accessKeySecret, Endpoint: &endpoint} // 初始化client cli, err := client.NewClient(&config) if err != nil { panic(err) } // 文件URL fileURL := "https://example.com/example.pdf" // 文件名 fileName := "example.pdf" // 初始化接口request request := client.SubmitConvertPdfToWordJobRequest{ FileUrl: &fileURL, FileName: &fileName, } response, err := cli.SubmitConvertPdfToWordJob(&request) if err != nil { panic(err) } // 打印结果 fmt.Println(response.Body.String()) }use AlibabaCloud\SDK\Docmindapi\V20220711\Docmindapi; use AlibabaCloud\SDK\Docmindapi\V20220711\Models\SubmitConvertPdfToWordJobRequest; use Darabonba\OpenApi\Models\Config; use AlibabaCloud\Tea\Utils\Utils\RuntimeOptions; use AlibabaCloud\Tea\Exception\TeaUnableRetryError; use AlibabaCloud\Credentials\Credential; // 使用默认凭证初始化Credentials Client。 $bearerToken = new Credential(); $config = new Config(); // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com $config->endpoint = "docmind-api.cn-hangzhou.aliyuncs.com"; // 通过credentials获取配置中的AccessKey ID $config->accessKeyId = $bearerToken->getCredential()->getAccessKeyId(); // 通过credentials获取配置中的AccessKey Secret $config->accessKeySecret = $bearerToken->getCredential()->getAccessKeySecret(); $config->type = "access_key"; $config->regionId = "cn-hangzhou"; $client = new Docmindapi($config); $request = new SubmitConvertPdfToWordJobRequest(); $runtime = new RuntimeOptions(); $runtime->maxIdleConns = 3; $runtime->connectTimeout = 10000; $runtime->readTimeout = 10000; $request->fileName = "example.pdf"; $request->fileUrl = "https://example.com/example.pdf"; try { $response = $client->submitConvertPdfToWordJob($request, $runtime); var_dump($response->toMap()); } catch (TeaUnableRetryError $e) { var_dump($e->getMessage()); var_dump($e->getErrorInfo()); var_dump($e->getLastException()); var_dump($e->getLastRequest()); }

正常返回示例

JSON格式

{ "RequestId": "43A29C77-405E-4CC0-BC55-EE694AD0****", "Data": { "Id": "docmind-20220712-b15f****" } }

步骤二:轮询文档转换结果查询服务GetDocumentConvertResult接口

调用查询接口的入参ID就是前面异步任务提交接口返回的出参ID,查询结果有处理中、处理成功、处理失败三种情况。建议每10秒轮询一次,最多轮询10分钟。若明确返回Completed为true或者超过轮询最大时间,则终止轮询。

请求参数

名称

类型

必填

描述

示例值

Id

string

需要查询的业务订单号,订单号从提交接口的返回结果中获取

docmind-20220712-b15f****

返回参数

名称

类型

描述

示例值

RequestId

string

请求唯一ID

43A29C77-405E-4CC0-BC55-EE694AD0****

Completed

boolean

异步任务是否处理完成,false表示任务仍在处理中,true代表任务处理完成,有处理成功或处理失败的明确结果

true

Status

string

异步任务处理完成的状态,最终处理结束后的状态。Success为处理成功,Fail为处理失败

Success

Data

object

文档转换的结果,具体参数详情见下表

是个列表

Code

string

状态码

200

Message

string

详细信息

message

Url

String

转换后的文件URL地址

https://example.com/example.pdf

Size

Integer

转换后的文件大小,单位为字节。

1345488

Type

String

转换后的文件后缀名,pdf转word是docx

pdf

Md5

String

转换后的文件md5值,调用者可以用于校验

2d49eb7705f9f93ed857874db247****

以Java SDK为例,调用文档智能解析接口的结果查询类API示例代码如下,调用getDocumentConvertResult接口,通过ID参数传入查询流水号。

说明

获取并使用AccessKey信息的方式,可参考SDK概述中不同语言的SDK使用指南。

import com.aliyun.docmind_api20220711.models.*; import com.aliyun.teaopenapi.models.Config; import com.aliyun.docmind_api20220711.Client; public static void submit() throws Exception { // 使用默认凭证初始化Credentials Client。 com.aliyun.credentials.Client credentialClient = new com.aliyun.credentials.Client(); Config config = new Config() // 通过credentials获取配置中的AccessKey ID .setAccessKeyId(credentialClient.getAccessKeyId()) // 通过credentials获取配置中的AccessKey Secret .setAccessKeySecret(credentialClient.getAccessKeySecret()); // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com config.endpoint = "docmind-api.cn-hangzhou.aliyuncs.com"; Client client = new Client(config); GetDocumentConvertResultRequest resultRequest = new GetDocumentConvertResultRequest(); resultRequest.id = "docmind-20220902-824b****"; GetDocumentConvertResultResponse response = client.getDocumentConvertResult(resultRequest); }const Client = require('@alicloud/docmind-api20220711'); const getResult = async () => { // 使用默认凭证初始化Credentials Client const cred = new Credential.default(); const client = new Client.default({ // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com endpoint: 'docmind-api.cn-hangzhou.aliyuncs.com', // 通过credentials获取配置中的AccessKey ID accessKeyId: cred.credential.accessKeyId, // 通过credentials获取配置中的AccessKey Secret accessKeySecret: cred.credential.accessKeySecret, type: 'access_key', regionId: 'cn-hangzhou' }); const resultRequest = new Client.GetDocumentConvertResultRequest(); resultRequest.id = "docmind-20220902-824b****"; const response = await client.getDocumentConvertResult(resultRequest); return response.body; }from alibabacloud_docmind_api20220711.client import Client as docmind_api20220711Client from alibabacloud_tea_openapi import models as open_api_models from alibabacloud_docmind_api20220711 import models as docmind_api20220711_models from alibabacloud_tea_util.client import Client as UtilClient from alibabacloud_credentials.client import Client as CredClient def query(): cred=CredClient() config = open_api_models.Config( # 通过credentials获取配置中的AccessKey ID access_key_id=cred.get_access_key_id(), # 通过credentials获取配置中的AccessKey Secret access_key_secret=cred.get_access_key_secret() ) # 访问的域名 config.endpoint = f'docmind-api.cn-hangzhou.aliyuncs.com' client = docmind_api20220711Client(config) request = docmind_api20220711_models.GetDocumentConvertResultRequest( # id : 任务提交接口返回的id id='docmind-20220902-824b****' ) try: # 复制代码运行请自行打印 API 的返回值 response = client.get_document_convert_result(request) # API返回值格式层级为 body -> data -> 具体属性。可根据业务需要打印相应的结果。获取属性值均以小写开头 # 获取异步任务处理情况,可根据response.body.completed判断是否需要继续轮询结果 print(response.body.completed) # 获取返回结果。建议先把response.body.data转成json,然后再从json里面取具体需要的值。 print(response.body.data) except Exception as error: # 如有需要,请打印 error UtilClient.assert_as_string(error.message)using Newtonsoft.Json; using System; using System.Collections; using System.Collections.Generic; using System.IO; using System.Threading.Tasks; using Tea; using Tea.Utils; public static void GetResult() { // 使用默认凭证初始化Credentials Client。 var akCredential = new Aliyun.Credentials.Client(null); AlibabaCloud.OpenApiClient.Models.Config config = new AlibabaCloud.OpenApiClient.Models.Config { // 通过credentials获取配置中的AccessKey Secret AccessKeyId = akCredential.GetAccessKeyId(), // 通过credentials获取配置中的AccessKey Secret AccessKeySecret = akCredential.GetAccessKeySecret(), }; // 访问的域名 config.Endpoint = "docmind-api.cn-hangzhou.aliyuncs.com"; AlibabaCloud.SDK.Docmind_api20220711.Client client = new AlibabaCloud.SDK.Docmind_api20220711.Client(config); AlibabaCloud.SDK.Docmind_api20220711.Models.GetDocumentConvertResultRequest request = new AlibabaCloud.SDK.Docmind_api20220711.Models.GetDocumentConvertResultRequest { Id = "docmind-20220902-824b****" }; AlibabaCloud.TeaUtil.Models.RuntimeOptions runtime = new AlibabaCloud.TeaUtil.Models.RuntimeOptions(); try { // 复制代码运行请自行打印 API 的返回值 client.GetDocumentConvertResult(request); } catch (TeaException error) { // 如有需要,请打印 error AlibabaCloud.TeaUtil.Common.AssertAsString(error.Message); } catch (Exception _error) { TeaException error = new TeaException(new Dictionary { { "message", _error.Message } }); // 如有需要,请打印 error AlibabaCloud.TeaUtil.Common.AssertAsString(error.Message); } }import ( "fmt" openClient "github.com/alibabacloud-go/darabonba-openapi/v2/client" "github.com/alibabacloud-go/docmind-api-20220711/client" "github.com/aliyun/credentials-go/credentials" ) func submit(){ // 使用默认凭证初始化Credentials Client。 credential, err := credentials.NewCredential(nil) // 通过credentials获取配置中的AccessKey ID accessKeyId, err := credential.GetAccessKeyId() // 通过credentials获取配置中的AccessKey Secret accessKeySecret, err := credential.GetAccessKeySecret() // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com var endpoint string = "docmind-api.cn-hangzhou.aliyuncs.com" config := openClient.Config{AccessKeyId: accessKeyId, AccessKeySecret: accessKeySecret, Endpoint: &endpoint} // 初始化client cli, err := client.NewClient(&config) if err != nil { panic(err) } id := "docmind-20220925-76b1****" // 调用查询接口 request := client.GetDocumentConvertResultRequest{Id: &id} response, err := cli.GetDocumentConvertResult(&request) if err != nil { panic(err) } // 打印查询结果 fmt.Println(response.Body) //注意,使用go语言sdk会将返回结果中Url中的特殊字符&做Unicode转码,转码成\u0026,需要调用者手动将\u0026进行Unicode转码回&,才可以正常下载URL //go语言sdk的Url返回示例如下http://docmind-api-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/convert/docmind-20220927-87ad**/example.pdf?Expires=XX\u0026OSSAccessKeyId=YY\u0026Signature=ZZ }use AlibabaCloud\SDK\Docmindapi\V20220711\Docmindapi; use AlibabaCloud\SDK\Docmindapi\V20220711\Models\GetDocumentConvertResultRequest; use Darabonba\OpenApi\Models\Config; use AlibabaCloud\Tea\Utils\Utils\RuntimeOptions; use AlibabaCloud\Tea\Exception\TeaUnableRetryError; use AlibabaCloud\Credentials\Credential; // 使用默认凭证初始化Credentials Client。 $bearerToken = new Credential(); $config = new Config(); // 访问的域名,支持ipv4和ipv6两种方式,ipv6请使用docmind-api-dualstack.cn-hangzhou.aliyuncs.com $config->endpoint = "docmind-api.cn-hangzhou.aliyuncs.com"; // 通过credentials获取配置中的AccessKey ID $config->accessKeyId = $bearerToken->getCredential()->getAccessKeyId(); // 通过credentials获取配置中的AccessKey Secret $config->accessKeySecret = $bearerToken->getCredential()->getAccessKeySecret(); $config->type = "access_key"; $config->regionId = "cn-hangzhou"; $client = new Docmindapi($config); $request = new GetDocumentConvertResultRequest(); $request->id = "docmind-20220902-824b****"; $runtime = new RuntimeOptions(); $runtime->maxIdleConns = 3; $runtime->connectTimeout = 10000; $runtime->readTimeout = 10000; try { $response = $client->getDocumentConvertResult($request, $runtime); var_dump($response->toMap()); } catch (TeaUnableRetryError $e) { var_dump($e->getMessage()); var_dump($e->getErrorInfo()); var_dump($e->getLastException()); var_dump($e->getLastRequest()); }

查询结果有处理中、处理成功、处理失败三种情况,分别说明每种情况的返回结果示例。

处理中的返回结果如下所示:

{ "RequestId": "2AABD2C2-D24F-12F7-875D-683A27C3****", "Completed": false, "Code": "DocProcessing", "Message": "Document processing", "HostId": "ocr-api.cn-hangzhou.aliyuncs.com", "Recommend": "https://next.api.aliyun.com/troubleshoot?q=DocProcessing&product=docmind-api" }

处理中Completed会返回false,表示任务没有处理结束,仍在处理中。这种情况需要继续轮询,直到明确返回Completed为true或者超过轮询最大时间。

处理失败的返回结果如下所示:

{ "RequestId": "A8EF3A36-1380-1116-A39E-B377BE27****", "Completed": true, "Status": "Fail", "Code": "UrlNotLegal", "Message": "Failed to process the document. The document url you provided is not legal.", "HostId": "docmind-api.cn-hangzhou.aliyuncs.com", "Recommend": "https://next.api.aliyun.com/troubleshoot?q=IDP.UrlNotLegal&product=docmind-api" }

处理失败Completed会返回true,表示任务处理结束,同时会返回Status为字符串的Fail,表示处理成功失败,同时会返回失败Code和详细原因Message。访问错误码可以查看错误码详细介绍。

处理成功的返回结果如下所示:

{ "Status": "Success", "RequestId": "73134E1A-E281-1B2C-A105-D0ECFE2D****", "Completed": true, "Data": [ { "Type": "docx", "Size": 7940, "Url": "http://docmind-api-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/convert/docmind-20220902-57de****/example.docx?Expires=1662190830&OSSAccessKeyId=XX&Signature=YY", "Md5": "36256a4c2db8e31e056aeeb8b871****" } ] }

处理成功Completed会返回true,表示任务处理结束,同时会返回Status为字符串的Success,表示处理成功。具体的处理结果在Data节点中,Data节点是个列表,该节点只会有1个元素,接下来介绍下Data节点中每个元素的具体格式:

Type表示转换后文档的文档类型,例如该pdf转word接口返回的是docx格式的word文档。

URL表示转换后文档的下载链接,每次查询请求返回的URL下载链接的有效期为60分钟,超期后该URL会失效,需要重新调用查询接口拿到新的URL下载链接。

Size表示转换后文档的文件大小,单位是字节,可以用于对用户下载的转换后文档做合法性校验,比较下载的转换后文档的文件大小和接口返回的Size是否相等。

Md5表示转换后文档的文件Md5摘要值,可以用于对用户下载的转换后文档做合法性校验,比较下载的转换后文档的Md5摘要值和接口返回的Md5是否相等。



【本文地址】

公司简介

联系我们

今日新闻

    推荐新闻

    专题文章
      CopyRight 2018-2019 实验室设备网 版权所有