action #743

find a tool to convert mediawiki to pdf

Added by lnussel almost 7 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Development
Target version:
Start date:
2013-09-02
Due date:
2013-10-25
% Done:
0%

Estimated time:
Duration: 40

Description

Jos needs an easy way to get from the features page in the opensuse wiki to a nice pdf document for the press.
It should be possible to retrieve the raw mediawiki source and convert to a markup language that is understood by office programs.

https://en.opensuse.org/index.php?title=openSUSE:Major_features&action=raw
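
A minimal sketch of the idea above, assuming Python 3.9+ and a pandoc binary on the PATH: fetch the raw markup through action=raw and pipe it through pandoc's mediawiki reader. The helper names (raw_url, pandoc_command, convert) are made up for illustration and are not part of any existing script; images referenced by the page are not downloaded.

```python
import subprocess
import urllib.request
from urllib.parse import urlencode


def raw_url(index_php: str, title: str) -> str:
    """Build the action=raw URL that returns a page's wiki markup."""
    # safe=":" keeps the namespace separator unescaped, as in the ticket URL.
    query = urlencode({"title": title, "action": "raw"}, safe=":")
    return f"{index_php}?{query}"


def pandoc_command(out_path: str, to_format: str = "odt") -> list[str]:
    """Command line that reads MediaWiki markup on stdin and writes out_path."""
    return ["pandoc", "--from", "mediawiki", "--to", to_format,
            "--output", out_path]


def convert(index_php: str, title: str, out_path: str) -> None:
    """Fetch the raw markup and pipe it through pandoc (needs network and pandoc)."""
    with urllib.request.urlopen(raw_url(index_php, title)) as response:
        wikitext = response.read()  # bytes, passed to pandoc unchanged
    subprocess.run(pandoc_command(out_path), input=wikitext, check=True)
```

Calling convert("https://en.opensuse.org/index.php", "openSUSE:Major_features", "features.odt") would then write an ODT file that can be opened in LibreOffice and exported to PDF from there.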

test.html (91.1 KB) asciidoc output -miska-, 2013-09-03 06:57
test.odt (37.7 KB) Libre Office output -miska-, 2013-09-03 06:57
Feature Guide.pdf (501 KB) That's what it should be... Anonymous, 2013-09-06 13:46
copy-paste.odt (1.13 MB) plain copy-paste into LO. Beats the script output... Anonymous, 2013-09-06 13:46
index.pdf (210 KB) deplate generated file (after manual tweaking) alarrosa, 2013-10-22 16:33

History

#1 Updated by -miska- almost 7 years ago

Results of my first tests with pandoc:

  • pictures are screwed up
  • I prefer the asciidoc output over the LibreOffice one
      • with a little bit of sed, we should be able to fix that one...

#2 Updated by toscalix almost 7 years ago

  • Status changed from New to In Progress

Michal, when you finish this task, assign it to Jos so he can try it.

#3 Updated by -miska- almost 7 years ago

  • Status changed from In Progress to Feedback
  • Assignee changed from -miska- to Anonymous

Does either of them look at least a little bit helpful? Or should I continue trying? AFAIK, the pictures will have to be redone manually anyway, but it looks like the LO output uses styles :-/

#4 Updated by Anonymous almost 7 years ago

If I have to do the pictures by hand, I gain little over copy-pasting it all by hand into LibreOffice... At least then I get the images, even though I still have to re-arrange many of them and clean up lots of the text/formatting. See the plain copy-paste in attached odt, imho better than what you created (but still a lot of work to get to attachment two, the final pdf).

If this takes more than another hour, it's not worth it: within 1-2 hours I can turn the wiki into this document by hand, so that's what I'll do. Unless you think the script can handle images too, I think we should just close this.

#5 Updated by -miska- almost 7 years ago

  • Status changed from Feedback to Rejected

Image formatting/alignment is difficult to handle automatically.

#6 Updated by lnussel almost 7 years ago

  • Due date set to 2013-10-07
  • Status changed from Rejected to New
  • Target version changed from 13.1 Beta 1 to 13.1 RC1

#7 Updated by -miska- almost 7 years ago

No update; I haven't tried anything new. I still think that if we need it to look pretty, with proper picture formatting, manual interaction will be needed.

#8 Updated by lnussel almost 7 years ago

  • Due date changed from 2013-10-07 to 2013-10-16
  • Target version changed from 13.1 RC1 to 13.1 RC2

Maybe those tools can do it. Worth a try IMO.

#9 Updated by lnussel almost 7 years ago

Note that this is not just useful for this release; it might be useful for documentation in general. Some things make sense to have in the wiki as well as in PDF.

#10 Updated by lnussel almost 7 years ago

  • Due date changed from 2013-10-16 to 2013-10-25
  • Assignee changed from -miska- to alarrosa

#11 Updated by alarrosa almost 7 years ago

After installing Haskell and a few other packages to try building wb2pdf from source, I found that some of its dependencies are not available in SUSE. The author provides only Windows and Ubuntu binaries, and recommends that anyone who wants to try wb2pdf on another distro install VirtualBox with an Ubuntu guest ... so I don't think it's worth the time installing it just to try it.
Next I'll try javalatex and deplate ( http://deplate.sourceforge.net/ )

#12 Updated by alarrosa almost 7 years ago

The LaTeX output of deplate gave errors when compiling it to a PDF, but after some manual tweaking, namely adding the following lines to the header,

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\DeclareUnicodeCharacter{00A0}{~}

I managed to make it generate a PDF. The awful attached PDF, to be more precise.
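
The manual tweak above could be scripted roughly as follows. This is only a sketch, assuming the \documentclass declaration sits on a single line of deplate's output; patch_preamble is a hypothetical helper name, not something deplate provides.

```python
# The three header lines quoted above.
PREAMBLE_FIX = [
    r"\usepackage[utf8]{inputenc}",
    r"\usepackage[T1]{fontenc}",
    r"\DeclareUnicodeCharacter{00A0}{~}",
]


def patch_preamble(tex: str) -> str:
    """Insert the header lines right after \\documentclass, skipping any already present."""
    lines = tex.split("\n")
    for i, line in enumerate(lines):
        if line.lstrip().startswith(r"\documentclass"):
            missing = [fix for fix in PREAMBLE_FIX if fix not in tex]
            return "\n".join(lines[: i + 1] + missing + lines[i + 1:])
    # No \documentclass line found: leave the input untouched.
    return tex
```

Running the function twice on the same text leaves it unchanged the second time, so it is safe to apply to a file that has already been fixed by hand.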

#13 Updated by alarrosa almost 7 years ago

I couldn't get javaLatex to generate a correct LaTeX document, even after spending some time trying to fix it manually, so I think I'll close this issue. If someone thinks it's worth spending more time to find or create a tool to convert mediawiki format to PDF, just reopen it or create a new one.

#14 Updated by toscalix almost 7 years ago

  • Status changed from New to Rejected

Enough time spent already.

#15 Updated by toscalix almost 7 years ago

  • Status changed from Rejected to Closed
