REPORTING SPECS

##################################
2020-09-25 OUTSTANDING REPORTING STUFF
This is stuff that may or may not be relevant in future, keeping here from my notes
REPORTING STUFF THAT IS FUTURE OR ON HOLD
MOVE THIS TO A v.next case or into reporting SPEC doc for future reference if not a coding issue now


todo: bizrule for report scale value at server
	and maybe other shit as well

todo: Need a setting that warns people about printing too much data, i.e. "That's a lot of data, are you sure you want to render that, it will be slow" or something to that effect
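	A minimal sketch of how that warning check could work, assuming a configurable row threshold. The names WARN_ROW_THRESHOLD, needsLargeReportWarning and largeReportMessage are hypothetical, and the default of 1000 is a placeholder, not a measured limit.

```javascript
// Hypothetical setting: number of records above which the user is warned.
// 1000 is a placeholder; a real default would come from load testing.
const WARN_ROW_THRESHOLD = 1000;

// Returns true when the client should ask for confirmation before rendering.
function needsLargeReportWarning(rowCount, threshold = WARN_ROW_THRESHOLD) {
  // A threshold of 0 or less turns the warning off entirely.
  if (threshold <= 0) return false;
  return rowCount > threshold;
}

// Builds the confirmation text shown to the user.
function largeReportMessage(rowCount) {
  return `That's a lot of data (${rowCount} records), are you sure you want to render that? It will be slow.`;
}
```

	The client would call needsLargeReportWarning with the selected record count before sending the render request, and only prompt when it returns true.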


todo: REPORT BACKEND - report delete throws server exception about db context re-use or something to that effect 
	No exact steps to repro, but it happens for Joyce with the win64 release build: go in and edit the same template a few times from the widget list, then attempt to delete it
	try to repro on a debug build and figure it out, could be a big issue if it's not specifically a report issue but a wider biz object issue
	COULD NOT REPRO HERE

todo: (On hold pending testing) pdf options UI and passthrough
	OUTSTANDING
		Docs
			DOCS reference pages
			https://stackoverflow.com/questions/49943479/puppeteer-header-and-footertemplate-doesnt-work#49996999
			https://pptr.dev/#?product=Puppeteer&version=v5.3.0&show=api-pagepdfoptions
			todo: document about the troubleshooting section items here if applicable:
				https://jsreport.net/learn/chrome-pdf
				which may or may not apply in our case
		Testing	

	DONE
		basically need to be able to select every option and send it through
			options: 
				http://www.puppeteersharp.com/api/PuppeteerSharp.PdfOptions.html
				http://www.puppeteersharp.com/api/PuppeteerSharp.Page.html#PuppeteerSharp_Page_PdfAsync_System_String_PuppeteerSharp_PdfOptions_
				https://pptr.dev/#?product=Puppeteer&version=v5.3.0&show=api-pagepdfoptions
			page numbers control
			Test this, it might do what we need as it has a template for pdf footer and page number is part of it
				http://www.puppeteersharp.com/api/PuppeteerSharp.PdfOptions.html
			look at jsreport what do they include in their pdf post processing parameters and capabilities
			need to add pdfkit or whatever it's called at the front.
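	A sketch of the options passthrough with a page number footer, using the option names from the Puppeteer (js) page.pdf() docs linked above. PuppeteerSharp's PdfOptions is assumed to expose equivalents under C# naming, so check the linked API page. buildPdfOptions and the default values are hypothetical. The explicit font-size in the template is there because the footer can otherwise render too small to see, which appears to be what the Stack Overflow link above covers.

```javascript
// Builds an options object for Puppeteer's page.pdf().
// userOptions are whatever the user picked in the (hypothetical) pdf options UI.
function buildPdfOptions(userOptions = {}) {
  // Puppeteer fills in elements with the classes "pageNumber" and "totalPages".
  const footerTemplate =
    '<div style="font-size:8px; width:100%; text-align:center;">' +
    'Page <span class="pageNumber"></span> of <span class="totalPages"></span>' +
    '</div>';

  return {
    format: "A4", // assumed default paper size
    printBackground: true,
    displayHeaderFooter: true, // required or the templates are ignored
    headerTemplate: "<span></span>", // empty header instead of the built-in one
    footerTemplate,
    // The footer is drawn inside the margin, so the bottom margin must leave room.
    margin: { top: "15mm", bottom: "15mm", left: "10mm", right: "10mm" },
    ...userOptions // user choices override the defaults (shallow merge)
  };
}
```

	Because the merge is shallow, a user-supplied margin replaces the whole default margin object rather than individual sides.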

todo: I have console logging capture code now in backend, but it's doing nothing really, just logs if exception thrown
	it might be handy if it returned the log value and any other diagnostic info  with render return data when user is in designer
	i.e. have a further property Diagnostic bool and if set then returns diagnostic data
	and return property with pdf name would be in an object with additional properties for diagnosis etc
	But only if it proves helpful or necessary
	mainly this would allow a console log or error or trace to flow back to the user from the script being run at the server
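	A rough sketch of what that return shape could look like. The property names (pdf, diagnostics, console) and the buildRenderResult helper are hypothetical, not the current API.

```javascript
// pdfName: name of the rendered pdf file.
// consoleEntries: array of { level, text } captured from the script run at the server.
// diagnostic: the proposed Diagnostic flag from the render request.
function buildRenderResult(pdfName, consoleEntries, diagnostic = false) {
  const result = { pdf: pdfName };
  if (diagnostic) {
    // Only attach diagnostic data when asked, so normal renders stay small.
    result.diagnostics = {
      console: consoleEntries.map(e => ({ level: e.level, text: e.text }))
    };
  }
  return result;
}
```

	With the flag off the result only carries the pdf name, so nothing changes for a normal render; the designer would set the flag and show result.diagnostics.console to the user.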

//before getting into timeouts and shit make sure it's running as well as can be in docker
todo: look at guidance for running puppeteer (js) on alpine docker here: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md#running-on-alpine
	how jsreport is launching headless chrome, i.e. which settings and flags etc
	https://github.com/puppeteer/puppeteer/blob/main/docs/api.md#puppeteerlaunchoptions
	https://jsreport.net/learn/chrome-pdf
	https://github.com/jsreport/jsreport-chrome-pdf/blob/master/lib/conversion.js
	(after looking at it, it's still a bit unclear, maybe not relevant as they seem to do a lot differently for that)


todo: look over this: https://github.com/puppeteer/puppeteer/issues/1834

todo: REPORT BACKEND - more timeouts for report rendering, hard kill after 30 seconds tops but adjustable interval maybe?
	each step takes part of the total time, need to see which step takes the most time and whether it's killable
	i.e. run a huge report and see which exact step takes which time for each
	we have the pre-render timing out maybe need more for each step where it's vulnerable to crashing / timing out
	
	 Devops droplet is overwhelmed by 10k records of widgets using the customfields example report
	symptom is super high cpu usage (100%) pegged and probably virtual memory usage as well
	timeout is kind of a hard core way to work around this issue, maybe instead it should be looking at excessive cpu and memory usage?
		The metrics don't catch it because it happens too quickly for the metrics lifecycle

	I want it to be able to handle it gracefully without crashing
	Looks like jsreport has this issue too and they hard kill the process I think if necessary, but they actually re-use the same instance for reporting I think.
	If it's not an issue then no need to resolve at the moment

	UPDATE: No way to timeout the pdf generation built into puppeteer sharp so far, maybe some workarounds
		I think the nuclear option would be to start a timer once we know the process id of the chromium instance and that timer automatically kills that process ID if it's still found around XX seconds later
		(or in the form of a clean up job maybe like the temp file deleter job)
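	A sketch of the clean up job variant, assuming the backend records the process id and start time of each chromium instance it launches. All names here (MAX_RENDER_MS, findStaleRenders, sweepStaleRenders) are hypothetical, and the 30 second limit just reflects the "30 seconds tops" idea above.

```javascript
const MAX_RENDER_MS = 30000; // assumed hard limit, would be an adjustable setting

// tracked: array of { pid, startedAtMs } recorded when each chromium instance launches.
// Returns the pids that have been running longer than maxAgeMs.
function findStaleRenders(tracked, nowMs, maxAgeMs = MAX_RENDER_MS) {
  return tracked.filter(p => nowMs - p.startedAtMs > maxAgeMs).map(p => p.pid);
}

// Kills every stale render and returns the pids that were actually killed.
// killFn is injected so the job can be tested without touching real processes.
function sweepStaleRenders(tracked, killFn, nowMs = Date.now(), maxAgeMs = MAX_RENDER_MS) {
  const killed = [];
  for (const pid of findStaleRenders(tracked, nowMs, maxAgeMs)) {
    try {
      killFn(pid);
      killed.push(pid);
    } catch (err) {
      // The process may already have exited on its own; nothing left to kill.
    }
  }
  return killed;
}
```

	In Node the kill function could be pid => process.kill(pid, "SIGKILL"); the backend would need its own equivalent. Running this on a short interval gives the same effect as a per-render timer without holding a timer per request.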

todo: reporting load test
	test locally with 20k widgets, make it crash then determine what limits to set on it and properly return error when it's exceeded
	right now it just says something about puppeteer and "crash"
	NOT ABLE TO REPRO HERE



todo: MEMORY / PERFORMANCE / CRASHING
	BOTTOM LINE:
		It *sometimes* can't render 5k records (lots of pages) on devops, but mostly it does, in around 2.5 to 3 minutes on average; the same report on the dev workstation takes 1.5 minutes on average with no crash
		uses a lot of ram, also cpu does get pegged by chrome process at 95% so it's a bit of both I guess
		If users have issue with server during reporting they should check ram and cpu
		a minimal droplet appears fine for normal workload, it's the big report rendering that's the most inefficient

 look at proper settings for rendering in docker on linux and try to make it resilient to crashing even if set too low or too little resources
	IDEAS:
		check guidelines for running in docker from link below
		check command line params for best fit
		see about re-using same or whatever
		check cpu, memory usage upon render request, reject immediately if the system is overburdened
		check for running chromium processes or some way to determine if in the middle of the last render for someone

		Try a method to zap all chromium processes as a test to see if they *can* be killed while stuck this way
		To test use the "custom date time format helpers" with 5k records, that reliably freezes everything even when others all seem to run ok
			(this in itself is curious but whatever)
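	A sketch of the "reject immediately if the system is overburdened" idea from the list above, written against Node's os module purely for illustration. The limits (25% free memory, 0.9 load per cpu) and the function names are assumptions, not tested values, and os.loadavg() always reports zeros on Windows.

```javascript
const os = require("os");

// Snapshot of current system pressure.
function readSystemStats() {
  const cpuCount = Math.max(os.cpus().length, 1); // guard against a zero cpu count
  return {
    freeMemRatio: os.freemem() / os.totalmem(), // 0..1, share of ram still free
    loadPerCpu: os.loadavg()[0] / cpuCount // 1 minute load average per cpu
  };
}

// Decides whether a new render request should be accepted.
// The limits are placeholders that would need tuning on the real droplet.
function canAcceptRender(stats, limits = { minFreeMemRatio: 0.25, maxLoadPerCpu: 0.9 }) {
  if (stats.freeMemRatio < limits.minFreeMemRatio) return { ok: false, reason: "low memory" };
  if (stats.loadPerCpu > limits.maxLoadPerCpu) return { ok: false, reason: "high cpu load" };
  return { ok: true, reason: null };
}
```

	The render endpoint would call canAcceptRender(readSystemStats()) first and return a proper "server busy" error instead of starting a render it cannot finish.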

	What I'm seeing is on a long render it returns a 504 gateway timeout to the client but it's still churning away in the background
		5k records and "custom date time format helpers" report will take about 3 minutes to render, maybe less and it will crash out for the client return 504 but then it will complete at some point
		Confirmed it *does* complete because I was able to download the 4.12mb 2144 page pdf manually from the server temp folder via filezilla once I saw the cpu go down again
			and chrome process stop
		90% memory is the max used in the DO (DigitalOcean) graph panel, probably docker is not letting it take all or something isn't
			maybe swapping out is what's happening, peak memory usage seems to be half a gig or so but c
		97% cpu usage is max

		I'm thinking it's a memory issue more than a cpu issue because the cpu is hardly pegged at all while it's rendering right up until it appears to run out of memory
		then it starts swapping and all hell breaks loose and the cpu pegs on the swap daemon
		Need to test by moving up to 2gb of memory with a resize but keep the single vcpu and see what's what
		also increase the timeout in nginx for the reverse proxy to wait 
		A temporary rendering caused memory shortage should not cause the system to come to a standstill

				
todo: memory usage and timeout is directly related to the amount of space taken up physically on the page
	it's NOT related to the helpers as just putting static text on the page causes the same issue
	it's memory taken to render to pdf probably and a byte is a byte even if it's blank white page
	=-=-=-=-=-







########## ORIGINAL SPECS ###################
CASES
1734 - REPORTS:GRIDS: - grid filter name and summary of filter criteria available as fields to print on report


REQUIREMENTS

- All v7 reports ported to RAVEN
    - ALL Fields even the ones that don't show on the report but are available for adding to a report in the editor need to be available

	- REPORTS
			- Report object has following properties:
				- DataList name it's based off of
				- Required fields from DataList
				- Report template itself with its own code and template requirements TBD
			- Report columns returned: When user selects to show a report, client will fixup any missing columns from the datalistview currently in use
				- For example they are viewing a table based on a TestWidgetDataList DataListview with only 3 columns in it
				- They drop down the reports list which shows all reports based off TestWidgetDataList view
				- They select a report to print.
				- Report code looks at report's required fields from DatalistView and sees report uses 6 fields listed
				- Code compares report fields to in use DataListview fields and appends any report required fields missing from current view to the right of the collection in the current DataListview 
				- When report is run it will have all fields this way returned but will still be sorted and filtered by table view
			- As part of editing process user can select an existing datalistview to prime their report editing view
			- A report can be selected from any client table that is based on the same view
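			The column fixup step described above can be sketched roughly as follows. fixupColumns and the field names in the usage note are hypothetical; the real DataListView column structure is not defined in these notes, so plain field-name strings stand in for it.

```javascript
// viewColumns: field names in the DataListView currently in use, in display order.
// reportRequiredFields: field names the selected report needs.
// Returns the view columns with any missing report fields appended on the right.
function fixupColumns(viewColumns, reportRequiredFields) {
  const present = new Set(viewColumns);
  // Keep only the required fields the view lacks; the Set drops repeats.
  const missing = new Set(reportRequiredFields.filter(f => !present.has(f)));
  // Existing columns keep their order; sort and filter are left untouched.
  return [...viewColumns, ...missing];
}
```

			For example, a 3 column view of ["name", "serial", "active"] and a report requiring those plus "price", "count" and "notes" gives all six fields, with the three new ones on the right.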
		

//=======================


USEFUL REPORTING RELATED LINKS

			https://github.com/jsreport/jsreport-core
			https://github.com/jsreport/jsreport-core/blob/master/lib/render/engineScript.js

			//actual render here
			https://github.com/jsreport/jsreport-chrome-pdf/blob/d3fe318aac3628d8cb62f86f8f71314f21745798/lib/conversion.js

			//PDF utils
			https://github.com/jsreport/jsreport-pdf-utils

			They use a Mozilla library called pdfjs and their utils are basically just wrappers around using it
			https://github.com/mozilla/pdf.js
			
			hub to docs here:
			https://mozilla.github.io/pdf.js/
			
	This is the jsreport designer libs used for reference: https://github.com/jsreport/jsreport-studio/blob/master/package.json
https://jsreport.net/learn/api

Report templates pre-designed and open source: https://github.com/wildbit/postmark-templates

HTML -> PDF
	JSREPORT has a comparison table of various html to pdf tools here:
		https://jsreport.net/learn/pdf-recipes


	Headless Chrome
		https://github.com/jsreport/jsreport-chrome-pdf
		FAST SPEED (according to jsreport docs)
		jsreport uses headless chrome by default which has built in pdf from html ability.
		they use a NODE library Puppeteer for it, but there is a c# wrapper for .net core linux windows mac: https://github.com/hardkoded/puppeteer-sharp
			some kind of example that may be relevant: https://github.com/kblok/netconfar-puppeteer-sharp-demo/blob/master/hacking-the-browser-api/Controllers/MediumController.cs#L13
			Maybe not the only one for c# core, need to dig around
		Issues:
			Issue with header / footer not being settable apparently which is a big breaking issue for many biz reports usage
			someone said that another pdf tool can be used to set those post processing but fuckery abounds
			other solutions below apparently don't have this issue.
			Even jsreport has listed workarounds and tools to resolve this
			Update: apparently there are ways:
				https://stackoverflow.com/questions/44575628/alter-the-default-header-footer-when-printing-to-pdf?noredirect=1&lq=1
				see last comment seems relevant, also other linked cases all mention various things.  Finally, could use a pdf writer tool to post process maybe.

			There seem to be many potential issues with missing libraries, rights and sandbox and etc etc etc on linux			
			These things kind of turned me off this a bit, it's not plug and play and simple
			

		LINKS:
			https://github.com/hardkoded/puppeteer-sharp
			https://github.com/puppeteer/puppeteer
			https://stackoverflow.com/search?q=puppeteer-sharp
			https://stackoverflow.com/questions/62042078/puppeteer-sharp-for-server-side-html-to-pdf-conversions
			https://www.singlestoneconsulting.com/blog/how-to-generate-server-side-pdf-reports-puppeteer-d3-handlebars/
			https://github.com/hardkoded/puppeteer-sharp/issues/1510 - shows being used by someone other than jsreport which buries all the details in endless libs
			https://github.com/hardkoded/puppeteer-sharp/issues/1514

	WeasyPrint
		https://github.com/jsreport/jsreport-weasyprint-pdf
		SLOWEST SPEED
		https://github.com/Kozea/WeasyPrint based on python, does its own rendering, doesn't rely on a web browser engine like the rest
		free, recommended by wkhtmltopdf author as an alternative
		May be slow, likely slower than the other options. Installation steps are a bit convoluted, but ironically only on windows, as on linux it's included in package managers
		Very good support for modern css3 PAGE properties apparently
		WRAPPER
			https://github.com/balbarak/WeasyPrint-netcore/blob/master/src/Balbarak.WeasyPrint/WeasyPrintClient.cs


	wkhtmltopdf 
		https://github.com/jsreport/jsreport-wkhtmltopdf
		MEDIUM SPEED
		https://wkhtmltopdf.org/downloads.html
		Well used but old, based on an older webkit so doesn't support css3, but likely enough for our purposes
		has an easy installer for all platforms
		free
		Has warnings about how unsanitized html can take down a server or own it somehow
		WRAPPERS
			https://github.com/carloscds/HtmlToPDFCore/tree/master/HtmlToPDFCore  This one looks cool, all platforms supported includes binary possibly?
			https://blog.elmah.io/generate-a-pdf-from-asp-net-core-for-free/


HTML -> DOCX
	This is a possibility that needs to be researched, instead of pdf go docx which is in theory multi platform and openable on other devices? Not sure
	https://github.com/jsreport/jsreport-docx
	https://github.com/EricWhiteDev/Open-Xml-PowerTools

HTML -> XLSX
	https://github.com/jsreport/jsreport-xlsx

HTML -> TEXT
	https://github.com/jsreport/jsreport-html-to-text
	https://github.com/jsreport/jsreport-text


TEMPLATE ENGINE
	https://github.com/jsreport/jsreport-handlebars
	Handlebars by default for jsreport which is easy peasy to work with

PDF META DATA EDITING
	https://github.com/jsreport/jsreport-pdf-meta

Render outputs
	JSReport renders to different outputs, they call it recipes https://jsreport.net/learn/recipes
	HTML - just outputs as html for viewing in the browser / printing from browser
	PDF: https://jsreport.net/learn/pdf-recipes
		They offer 5 different pdf converters because they all support different feature sets, which is ominous



BAR CODE STUFF
	Bar codes: https://github.com/metafloor/bwip-js 
		https://stackoverflow.com/questions/19017512/use-canvas-inside-a-handlebars-template
		https://github.com/metafloor/bwip-js#browser-usage
		https://www.scandit.com/blog/types-barcodes-choosing-right-barcode/
		https://github.com/metafloor/bwip-js/wiki/BWIPP-Barcode-Types

		let opt = {
			bcid: "code128", // Barcode type
			text: "0123456789", // Text to encode
			scale: 3, // 3x scaling factor
			height: 10, // Bar height, in millimeters
			includetext: true, // Show human-readable text
			textxalign: "center" // Always good to set this
		};
