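Java POI: Converting Word to HTML (doc, docx) and Reading Word Content as HTML
Background: Word content needs to be imported into a rich-text editor. The utility class below reads both doc and docx files.
Utility class WordToHtml.java: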
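// Assumes the xdocreport 2.x converter packages (fr.opensagres.*);
// 1.x releases used org.apache.poi.xwpf.converter.* instead.
import fr.opensagres.poi.xwpf.converter.core.ImageManager;
import fr.opensagres.poi.xwpf.converter.xhtml.XHTMLConverter;
import fr.opensagres.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.PicturesManager;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.usermodel.Picture;
import org.apache.poi.hwpf.usermodel.PictureType;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;
import org.springframework.web.multipart.MultipartFile;
import org.w3c.dom.Document;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.*;
import java.util.List;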
/**
 * Converts Word documents (.doc and .docx) to HTML.
 */
@Component
public class WordToHtml {

    // Directory where extracted pictures are saved
    @Value("${word.pic.save.path}")
    private String picPath;

    /**
     * Converts an uploaded Word file to an HTML string.
     *
     * @param file the file to convert
     * @return the HTML string, or an error message if the file cannot be converted
     */
    public String readeWordToHtml(MultipartFile file) {
        if (file == null) {
            return "";
        }
        // The extension decides which converter is used (.doc or .docx)
        String suffix = file.getOriginalFilename()
                .substring(file.getOriginalFilename().lastIndexOf(".") + 1);
        // Base URL under which the saved pictures are served
        String picViewPath = "127.0.0.1:8761/server/dietc/source/view/word/pic/";
        try {
            if (suffix.equalsIgnoreCase("doc")) {
                HWPFDocument wordDocument = new HWPFDocument(file.getInputStream());
                WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
                        DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
                // Point every image in the generated HTML at the picture URL
                wordToHtmlConverter.setPicturesManager(new PicturesManager() {
                    @Override
                    public String savePicture(byte[] content,
                            PictureType pictureType, String suggestedName,
                            float widthInches, float heightInches) {
                        return picViewPath + suggestedName;
                    }
                });
                wordToHtmlConverter.processDocument(wordDocument);
                // Save the embedded pictures to picPath
                List<Picture> pics = wordDocument.getPicturesTable().getAllPictures();
                if (pics != null) {
                    for (Picture pic : pics) {
                        try (FileOutputStream fos = new FileOutputStream(
                                new File(picPath + pic.suggestFullFileName()))) {
                            pic.writeImageContent(fos);
                        } catch (FileNotFoundException e) {
                            e.printStackTrace();
                        }
                    }
                }
                Document htmlDocument = wordToHtmlConverter.getDocument();
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                DOMSource domSource = new DOMSource(htmlDocument);
                StreamResult streamResult = new StreamResult(out);
                TransformerFactory tf = TransformerFactory.newInstance();
                Transformer serializer = tf.newTransformer();
                serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
                serializer.setOutputProperty(OutputKeys.INDENT, "yes");
                serializer.setOutputProperty(OutputKeys.METHOD, "html");
                // Serialize the DOM into the byte buffer before reading it back
                serializer.transform(domSource, streamResult);
                String result = new String(out.toByteArray(), "UTF-8").replaceAll("↵", "");
                out.close();
                return result;
            } else if (suffix.equalsIgnoreCase("docx")) {
                XWPFDocument document = new XWPFDocument(file.getInputStream());
                XHTMLOptions options = XHTMLOptions.create();
                // Extract pictures into picPath
                ImageManager imageManager = new ImageManager(new File(picPath), "");
                options.setIgnoreStylesIfUnused(false);
                options.setFragment(true);
                options.setImageManager(imageManager);
                // Convert the XWPFDocument to XHTML
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                XHTMLConverter.getInstance().convert(document, out, options);
                String result = new String(out.toByteArray());
                out.close();
                // Prefix every image src with the picture URL
                return result.replaceAll("<img src=\"", "<img src=\"" + picViewPath);
            } else {
                return "Please upload a .doc or .docx file";
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("Invalid file format!");
            return "Invalid file format!";
        }
    }
}
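The two string steps in readeWordToHtml (reading the file extension and prefixing image src values) can be tried out without POI or Spring. The following is a minimal sketch, not the original author's code: WordHtmlStrings, extensionOf and prefixImageSources are hypothetical names introduced for illustration, and http://example.com/pic/ is a placeholder URL.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the original WordToHtml class.
final class WordHtmlStrings {

    private WordHtmlStrings() {
    }

    /** Returns the lower-cased extension of a file name, or "" if there is none. */
    static String extensionOf(String fileName) {
        if (fileName == null) {
            return "";
        }
        int dot = fileName.lastIndexOf('.');
        // No dot at all, or a trailing dot: treat as "no extension"
        if (dot < 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Prepends baseUrl to the src value of every <img src="..."> in html. */
    static String prefixImageSources(String html, String baseUrl) {
        String marker = "<img src=\"";
        // String.replace treats both arguments literally, so characters such
        // as '$' or '\' in baseUrl are not interpreted as regex syntax.
        return html.replace(marker, marker + baseUrl);
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("report.DOCX"));
        System.out.println(prefixImageSources(
                "<p><img src=\"image1.png\"></p>", "http://example.com/pic/"));
    }
}
```

Unlike the substring call in the class above, this sketch returns an empty string for a null name or a name without a dot, so such uploads would fall through to the "unsupported file" branch instead of raising an exception. It also only matches tags written exactly as <img src=", which is the same assumption the original replaceAll call makes.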
Thanks for reading. If you have any questions, leave a comment and I will reply as soon as I see it! (*^_^*)
