#!/usr/bin/perl 
use strict;
use warnings;
use Getopt::Long;
use Pod::Usage;
use Debian::LicenseReconcile::App;

# Command line options for documentation
my $man = 0;
my $help = 0;

# If this is set we don't actually report errors
# at all, but merely succeed or fail.
my $quiet = 0;

# If set, output suggested DEP-5 stanzas
my $suggest_stanzas = 0;

# TODO: abstract output to allow flexibility of reporting

# TODO: What's this again?
my $display_mapping = 0;
my $check_copyright = 1;
my $format_spec = 1;

# Ordered list of filters used to process the code
# Rules:	config driven set of rules
# ChangeLog:	derive data for debian directory from changelog
# Std:		derive data from output of licensecheck
# Shebang:	look for shebanged files and apply licensecheck to them
# Default:	last chance rules, Rules code renamed to Default
my @filters = qw(Rules ChangeLog Std Shebang Default~Rules);
my @filters_override = ();

# Default location of various files
my $config_file = 'debian/license-reconcile.yml';
my $changelog_file = 'debian/changelog';
my $directory = ".";
my $copyright_file = 'debian/copyright';

GetOptions(
    'help|?' => \$help,
    man => \$man,
    'copyright-file=s' => \$copyright_file,
    'format-spec!' => \$format_spec,
    'quiet!' => \$quiet,
    'display-mapping!' => \$display_mapping,
    'directory=s' => \$directory,
    'config-file=s' => \$config_file,
    'changelog-file=s' => \$changelog_file,
    'check-copyright!' => \$check_copyright,
    'suggest-stanzas!' => \$suggest_stanzas,
    'filters=s@' => sub {
        shift; # name of option
        my $value = shift;
        if (not grep {$value eq $_} @filters_override) {
            push @filters_override, $value;
        }
    },
) or pod2usage(2);
pod2usage(1) if $help;
pod2usage(-exitstatus => 0, -verbose => 2) if $man;

if (@filters_override) {
    @filters = @filters_override;
}

my $files = _package_args(@ARGV);

my $app = Debian::LicenseReconcile::App->new(
    copyright           => $copyright_file,
    check_copyright     => $check_copyright,
    changelog_file      => $changelog_file,
    format_spec         => $format_spec,
    config_file         => $config_file,
    directory           => $directory,
    quiet               => $quiet,
    display_mapping     => $display_mapping,
    filters             => \@filters,
    files		=> $files,
    suggest_stanzas	=> $suggest_stanzas,
);
exit($app->run);

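# Turn the file arguments into a hash ref used as a lookup set.
# Returns undef when no files were given, so no restriction applies.
sub _package_args {
    my @files = @_;
    my $result = undef;
    if (@files) {
        %$result = map {$_ => 1} @files;
    }
    return $result;
}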

=head1 NAME

license-reconcile - reconcile debian/copyright against source

=head1 SYNOPSIS

B<license-reconcile> B<--help>|B<--man>

B<license-reconcile> [B<--copyright-file=>I<file>] [B<--no-check-copyright>] [B<--suggest-stanzas>] [B<--no-format-spec>] [B<--quiet>] [B<--display-mapping>] [B<--directory=>I<directory>] [B<--filters=>I<module1> B<--filters=>I<module2> ...] [B<--config-file=>I<file>] [B<--changelog-file=>I<file>] [I<files...>]

=head1 DESCRIPTION

B<license-reconcile> attempts to match license and copyright information
in a directory with the information available in C<debian/copyright>.
It gets most of its data from C<licensecheck> so should produce something
worth looking at out of the box.
However for a given package it can be configured to succeed in a known good
state, so that if on subsequent upstream updates it fails, it points out
what needs looking at.
By default the tests run are as follows:

=over

=item - Does the copyright file have an approved format specification as its
first line?

=item - Can the copyright file be parsed?

=item - Does every file in the source match at least one clause in the copyright
file?

=item - Can every file, license and copyright datum extracted from the source be
contained in the corresponding matching paragraph from the copyright file?
The data for this comparison comes from a number of filter objects. See L</Filters>
for more information.

=item - Is every file in the source assigned copyright and a license by some
part of the C<debian/copyright> file?

=back

=head1 GETTING STARTED

=head2 out of the box

From the top level of the source directory of Debian packaged software,
just run C<license-reconcile>.

=head2 setting a config file

Normally to make any progress it will be necessary to have a config file.
The default file is C<debian/license-reconcile.yml>. A different config
file can be set with the B<--config-file=>I<file> option. The config file is
interpreted using L<Config::Any> but for the purposes of this documentation
we assume the format is L<YAML>.

=head2 overriding incorrect results

Suppose you are really lucky. For just one file, C<a/b>, the default filters,
which are wrappers around C<licensecheck>, have got it wrong. They have for
some reason decided that the file has a GPL-3 license, when inspection shows it is
in the public domain. This is causing a false positive break against your carefully
crafted C<debian/copyright> file. You can fix this with the following
config fragment:

 Rules:
  rules:
   -
    Glob: a/b
    License: public-domain
    Copyright: 1556, Nostrodamus

See L<Debian::LicenseReconcile::Filter::Rules> for more information on how
to configure this filter.

=head2 providing a catch all license and copyright

You can make the filters provide a default license by providing a suitable
rule in the Default section of the config file:

 Default:
  rules:
   -
    License: All software is property of the proletariat license
    Copyright: 1984, Ministry of Algorithms

The Default filter uses exactly the same code as the Rules filter, but by default
runs last. So it has all the same functionality but the lowest precedence.
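
In the script's default filter list this filter is written as C<Default~Rules>,
meaning the config section is C<Default> and the code comes from C<Rules>. The
following is only a sketch of how such a spec could be split into its two
parts. C<split_filter_spec> is a hypothetical helper and not part of
B<license-reconcile>, whose real implementation may differ.

 ```perl
 use strict;
 use warnings;

 # Hypothetical helper, for illustration only: split a filter spec into
 # the config section name and the name of the module supplying the code.
 sub split_filter_spec {
     my ($spec) = @_;
     my ($config_name, $code_name) = split /~/, $spec, 2;
     # A plain name such as "Std" uses the same name for both parts.
     $code_name = $config_name if not defined $code_name;
     return ($config_name, $code_name);
 }

 for my $spec (qw(Rules Default~Rules)) {
     my ($config_name, $code_name) = split_filter_spec($spec);
     print "$spec: config section '$config_name', code from '$code_name'\n";
 }
 ```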

=head2 controlling the sequence of filters

By default the filters run are: Rules, ChangeLog, Std, Shebang and Default.
You can vary the filters using the B<--filters=>I<module> option. Setting
C<--filters Rules> would mean that only the
L<Debian::LicenseReconcile::Filter::Rules> filter would be used. The option
replaces the default list rather than adding to it, so once you specify one
filter you must specify them all, repeating the option once per filter in
the order they should run.
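
As an illustration, the sketch below mirrors the way this script collects
repeated B<--filters> options with L<Getopt::Long>: values are kept in the
order given and duplicates are dropped. C<parse_filters> is a hypothetical
wrapper introduced only for this example.

 ```perl
 use strict;
 use warnings;
 use Getopt::Long qw(GetOptionsFromArray);

 # Hypothetical wrapper, for illustration only: collect repeated
 # --filters options in order, skipping values already seen.
 sub parse_filters {
     my @args = @_;
     my @chosen;
     GetOptionsFromArray(
         \@args,
         'filters=s@' => sub {
             my (undef, $value) = @_;    # first argument is the option name
             push @chosen, $value if not grep { $value eq $_ } @chosen;
         },
     ) or die "bad options\n";
     return @chosen;
 }

 my @filters = parse_filters(
     qw(--filters Rules --filters Std --filters Rules --filters Default~Rules)
 );
 print "@filters\n";
 ```

Because the collected list replaces the default one, an invocation with these
options would run only Rules, Std and Default, in that order.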

=head2 filter aliasing

The Default filter is an alias for Rules. This means it runs the same code
but has a separate config. Default is defined as "Default~Rules". In general
"X~Y" means use the code from Y but get the config from X.

=head2 writing your own filter

You can write your own filters by inheriting from
L<Debian::LicenseReconcile::Filter>. You need to define the C<get_info>
method.

=head1 OPTIONS

=head2 B<--copyright-file=>I<file>

Specify an alternative copyright file. Defaults to C<debian/copyright>.

=head2 B<--no-format-spec>

Don't check the first line of the copyright file against permitted format
specifications.

=head2 B<--no-check-copyright>

Don't check the copyright clauses.

=head2 B<--quiet>

Don't give any explanations, simply a success or a fail via the exit status.

=head2 B<--display-mapping>

Display mapping from the directory onto the copyright clauses.

=head2 B<--directory=>I<directory>

The directory whose copyright and licenses will be verified. This defaults to ".".

=head2 B<--filters=>I<module1> B<--filters=>I<module2> ...

A sequence of filters which will inspect the source package and return
license and copyright information. Each module name must sit below the
L<Debian::LicenseReconcile::Filter> namespace and inherit from that class.
The default value is "Rules ChangeLog Std Shebang Default~Rules".

=head2 B<--config-file=>I<file>

A file used to provide filter specific configuration data. The file is read
by L<Config::Any> and the relevant section is passed to each filter constructor
via the C<config> parameter.

=head2 B<--changelog-file=>I<file>

The Debian changelog file which defaults to C<debian/changelog>. The Rules filter
uses this to get the current version and the ChangeLog filter gets its data
from it.

=head2 B<--suggest-stanzas>

If set, print out the license and copyright data in DEP-5 format.

=head1 Filters

By default the filters are processed in the order below. Once a file has been
returned by a filter, subsequent filters will ignore it.

=over

=item - L<Rules|Debian::LicenseReconcile::Filter::Rules>

=item - L<ChangeLog|Debian::LicenseReconcile::Filter::ChangeLog>

=item - L<Std|Debian::LicenseReconcile::Filter::Std>

=item - L<Shebang|Debian::LicenseReconcile::Filter::Shebang>

=item - L<Default|Debian::LicenseReconcile::Filter::Default>

=back

Each filter constructor will be passed the following parameters:

=over

=item - directory - the directory from which to find license and copyright data.

=item - files_remaining - an array ref of files which have not been analyzed.

=item - config - a data structure representing the portion of the config file
relevant to this filter.

=item - changelog - a L<Parse::DebianChangelog> object.

=item - licensecheck - a L<Debian::LicenseReconcile::LicenseCheck> object.

=back

=head1 FILE ARGUMENTS

Any arguments after the options are assumed to be files. If specified, only
these files will be reconciled.

=head1 LIMITATIONS

The DEP-5 specification is subtly different from the file glob specification.
Since the L<File::FnMatch> module is the only practical implementation there
is little that can be done. The consequence is that attempting to specify that a
file name should contain '[' and later a ']' in C<debian/copyright> is unlikely
to work correctly.

In copyright parsing, years cannot be expressed in an abbreviated two-digit form.
This is probably a good thing, but it will surely cause an issue at some point.

=head1 EXAMPLES

Two Debian projects are currently using B<license-reconcile> to verify the
C<debian/copyright> file.

=over

=item L<license-reconcile|https://anonscm.debian.org/cgit/pkg-perl/packages/license-reconcile.git/tree/debian/license-reconcile.yml> must obviously be clean by its own standards. At some
point this check will be added to the build tests.

=item L<ksh|http://anonscm.debian.org/cgit/collab-maint/ksh.git/tree/debian/license-reconcile.yml>
has been using license-reconcile since version 93u+20120801-2.

=back

=head1 AUTHOR

Nicholas Bamber, C<< <nicholas at periapt.co.uk> >>

=head1 LICENSE AND COPYRIGHT

Copyright 2012, 2015, Nicholas Bamber.

This program is free software; you can redistribute it and/or modify it
under the terms of either: the GNU General Public License as published
by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.

=cut
